Preface


In the 1960s, there were a number of classic books written on quantum field theory. 
Because of the phenomenal experimental success of quantum electrodynamics 
(QED), quantum field theory became a rigorous body of physical knowledge, as 
established as nonrelativistic quantum mechanics. 

In the 1970s and 1980s, because of the growing success of gauge theories, 
it was clear that a typical 1-year course in quantum field theory was rapidly 
becoming obsolete. A number of advanced books appeared on various aspects of 
gauge theories, so often a 1-year course on quantum field theory became disjoint, 
with one book on QED being the basis of the first semester and one of several 
books on various aspects of gauge theories being the basis of the second semester. 

Today, because of the success of the Standard Model, it is necessary to consolidate
and expand the typical 1-year quantum field theory course. There is
obviously a need for a book for the 1990s, one that presents this material in a 
coherent fashion and uses the Standard Model as its foundation in the same way 
that earlier books used QED as their foundation. Because the Standard Model 
is rapidly becoming as established as QED, there is a need for a textbook whose 
focus is the Standard Model. 

As a consequence, we have divided the book into three parts, which can be 
used in either a two- or a three-semester format: 


I: Quantum Fields and Renormalization
II: Gauge Theory and the Standard Model
III: Nonperturbative Methods and Unification


Part I of this book summarizes the development of QED. It provides the foun- 
dation for a first-semester course on quantum field theory, laying the basis for 
perturbation theory and renormalization theory. (However, one may also use it 
in the last semester of a three-semester course on quantum mechanics, treating 
it as the relativistic continuation of a course on nonrelativistic quantum mechanics. 
In this fashion, students who are not specializing in high-energy physics will find
Part I particularly useful, since perturbation theory and Feynman diagrams have 
now penetrated into all branches of quantum physics.) 

In Part II, the Standard Model is the primary focus. This can be used as the 
basis of a second semester course on quantum field theory. Particular attention 
is given to the method of path integrals and the phenomenology of the Standard 
Model. This chapter is especially geared to students wanting an understanding 
of high-energy physics, where a working knowledge of the Standard Model is a 
necessity. It is hoped that the student will finish this section with an appreciation 
of the overwhelming body of experimental evidence pointing to the correctness 
of the Standard Model. 

Because the experiments necessary to go beyond the Standard Model are 
rapidly becoming prohibitively expensive and time consuming, we are also aware 
that the development of physics into the next decade may become increasingly 
theoretical, and therefore we feel that an attempt should be made to explore the 
various theories that take us beyond the Standard Model. 

Part III of this book, therefore, is geared to the students who wish to pursue 
more advanced material and can be used in one of two ways. A lecturer may want 
to treat a few of the chapters in Part III at the end of a typical two semester course 
on quantum field theory. Or, Part III can be used as the basis of a third semester
course. We are providing a variety of topics so that the lecturer may pick and 
choose the chapters that are most topical and are of interest. We have written Part 
III to leave as much discretion as possible for the lecturer in using this material. 

The approach that we have taken in our book differs from that taken in other 
books in several ways: 

First, we have tried to consolidate and streamline, as much as possible in a 
coherent fashion, a large body of information in one book, beginning with QED, 
leading to the Standard Model, and ending on supersymmetry. 

Second, we have emphasized the role of group theory, treating many of the 
features of quantum field theory as the byproduct of the Lorentz, Poincaré, and 
internal symmetry groups. Viewed in this way, many of the rather arbitrary and 
seemingly contrived conventions of quantum field theory are seen as a conse- 
quence of group theory. Group theory, especially in Part III, plays an essential 
role in understanding unification. 

Third, we have presented three distinct proofs of renormalization theory. Most 
books, if they treat renormalization theory at all, only present one proof. However, 
because of the importance of renormalization theory to today’s research, the 
serious student may find that a single proof of renormalization is not enough. 
The student may be ill prepared to handle research when renormalization theory 
is developed from an entirely different approach. As a consequence, we have 
presented three different proofs of renormalization theory so that the student can 
become fluent in at least two different methods. We have presented the original 
Dyson/Ward proof in Chapter 7. In Part II, we also present two different proofs
based on the BPHZ method and the renormalization group. 

Fourth, we should caution the reader that experimental proof of nonperturba- 
tive quark confinement or of supersymmetry is absolutely nonexistent. However, 
since the bulk of current research in theoretical high-energy physics is focused 
on the material covered in Part III, this section should give the student a brief 
overview of the main currents in high-energy physics. Moreover, our attitude
is to treat nonperturbative field theory and supersymmetries as useful theoretical 
“laboratories” in which to test many of our notions about quantum field theory. We 
feel that these techniques in Part III, if viewed as a rich, productive laboratory in 
which to probe the limits of field theory, will yield great dividends for the serious 
student. 

We have structured the chapters so that they can be adapted in many different 
ways to suit different needs. In Part I, for example, the heart of the canonical 
quantization method is presented in Chapters 3-6. These chapters are essential 
for building a strong foundation to quantum field theory and Feynman diagrams. 
Although path integral methods today have proven more flexible for gauge theo- 
ries, a student will have a much better appreciation for the rigor of quantum field 
theory by reading these chapters. Chapters 2 and 7, however, can be skipped 
by the student who either already understands the basics of group theory and 
renormalization, or who does not want to delve that deeply into the intricacies of 
quantum field theory. 

In Part II, the essential material is contained in Chapters 8-11. In these chapters,
we develop the necessary material to understand the Standard Model, that is,
path integrals, gauge theory, spontaneous symmetry breaking, and phenomenol- 
ogy. This forms the heart of this section, and cannot be omitted. However, 
Chapters 12-14 should only be read by the student who wants a much more de- 
tailed presentation of the subtleties of quantum field theory (BRST, anomalies, 
renormalization group, etc.). 

In Part III, there is great freedom to choose which material to study, depending 
on the person’s interests. We have written Part III to give the greatest flexibility 
to different approaches in quantum field theory. For those who want an understanding
of quark confinement and nonperturbative methods, Chapters 15-17 are essential.
The student wishing to investigate Grand Unified Theories should study Chapter 
18. However, the student who wishes to understand some of the most exciting 
theoretical developments of the past decade should read Chapters 19-21. 

Because of the wide and often confusing range of notations and conventions 
found in the literature, we have tried to conform, at least in the early chapters, to 
those appearing in Bjorken and Drell, Itzykson and Zuber, and Cheng and Li. We 
also choose our metric to be $g_{\mu\nu} = (+, -, -, -)$.

We have also included 311 exercises in this book, which appear after each 
chapter. We consider solving these exercises essential to an understanding of the 
material. Often, students complain that they understand the material but cannot
do the problems. We feel that this is a contradiction in terms. If one cannot do the 
exercises, then one does not really fully understand the material. 

In writing this book, we have tried to avoid two extremes. We have tried to 
avoid giving an overly tedious treatise of renormalization theory and the obscure 
intricacies of Feynman graphs. One is reminded of being an apprentice during the 
Middle Ages, where the emphasis was on mastering highly specialized, arcane 
techniques and tricks, rather than getting a comprehensive understanding of the 
field. 

The other extreme is a shallow approach to theoretical physics, where many 
vital concepts are deleted because they are considered too difficult for the student. 
Then the student receives a superficial introduction to the field, creating confusion 
rather than understanding. Although students may prefer an easier introduction to 
quantum field theory, ultimately it is the student who suffers. The student will be 
totally helpless when confronted with research. Even the titles of the high-energy 
preprints will be incomprehensible. 

By taking this intermediate approach, we hope to provide the student with a 
firm foundation in many of the current areas of research, without overwhelming 
the student in an avalanche of facts. We will consider the book a success if we 
have been able to avoid these extremes. 


New York M. K. 
July 1992 
Part I


Quantum Fields 
and Renormalization 


Chapter 1 
Why Quantum Field Theory? 


Anyone who is not shocked by the quantum theory does not understand 
it. 
—N. Bohr 


1.1 Historical Perspective


Quantum field theory has emerged as the most successful physical framework 
describing the subatomic world. Both its computational power and its conceptual 
scope are remarkable. Its predictions for the interactions between electrons and 
photons have proved to be correct to within one part in $10^8$. Furthermore, it can
adequately explain the interactions of three of the four known fundamental forces 
in the universe. The success of quantum field theory as a theory of subatomic 
forces is today embodied in what is called the Standard Model. In fact, at present, 
there is no known experimental deviation from the Standard Model (excluding 
gravity). 

This impressive list of successes, of course, has not been without its problems. 
In fact, it has taken several generations of the world’s physicists working over 
many decades to iron out most of quantum field theory’s seemingly intractable 
problems. Even today, there are still several subtle unresolved problems about the 
nature of quantum field theory. 

The undeniable successes of quantum field theory, however, were certainly not 
apparent in 1927 when P.A.M. Dirac¹ wrote the first pioneering paper combining
quantum mechanics with the classical theory of radiation. Dirac’s union of non- 
relativistic quantum mechanics, which was itself only 2 years old, with the special 
theory of relativity and electrodynamics would eventually lay the foundation of 
modern high-energy physics. 

Breakthroughs in physics usually emerge when there is a glaring conflict be- 
tween experiment and theory. Nonrelativistic quantum mechanics grew out of the 
inability of classical mechanics to explain atomic phenomena, such as black body 
radiation and atomic spectra. Similarly, Dirac, in creating quantum field theory,
realized that there were large, unresolved problems in classical electrodynamics 
that might be solved using a relativistic form of quantum mechanics. 

In his 1927 paper, Dirac wrote: “... hardly anything has been done up to the 
present on quantum electrodynamics. The questions of the correct treatment of 
a system in which the forces are propagated with the velocity of light instead of 
instantaneously, of the production of an electromagnetic field by a moving electron, 
and of the reaction of this field on the electron have not yet been touched.” 

Dirac’s keen physical intuition and bold mathematical insight led him in 1928 
to postulate the celebrated Dirac electron theory. Developments came rapidly 
after Dirac coupled the theory of radiation with his relativistic theory of the 
electron, creating Quantum Electrodynamics (QED). His theory was so elegant 
and powerful that, when conceptual difficulties appeared, he was not hesitant to 
postulate seemingly absurd concepts, such as “holes” in an infinite sea of negative 
energy. As he stated on a number of occasions, it is sometimes more important to 
have beauty in your equations than to have them fit experiment. 

However, as Dirac also firmly realized, the most beautiful theory in the world 
is useless unless it eventually agrees with experiment. That is why he was gratified 
when his theory successfully reproduced a series of experimental results: the spin 
and magnetic moment of the electron and also the correct relativistic corrections 
to the hydrogen atom’s spectra. His revolutionary insight into the structure of 
matter was vindicated in 1932 with the experimental discovery of antimatter. 
This graphic confirmation of his prediction helped to erase doubts concerning the 
correctness of Dirac’s theory of the electron. 

However, the heady days of the early 1930s, when it seemed like child's play 
to make major discoveries in quantum field theory with little effort, quickly came 
to a halt. In some sense, the early successes of the 1930s only masked the deeper 
problems that plagued the theory. Detailed studies of the higher-order corrections 
to QED raised more problems than they solved. In fact, a full resolution of
these questions would have to wait several decades. From the work of Weisskopf,
Pauli, Oppenheimer, and many others, it was quickly noticed that QED was 
horribly plagued by infinities. The early successes of QED were premature: they 
only represented the lowest-order corrections to classical physics. Higher-order 
corrections in QED necessarily led to divergent integrals. 

The origin of these divergences lay deep within the conceptual foundation of 
physics. These divergences reflected our ignorance concerning the small-scale 
structure of space-time. QED contained integrals which diverged as $x \to 0$, or,
in momentum space, as $k \to \infty$. Quantum field theory thus inevitably faced
divergences emerging from regions of space-time and matter-energy beyond 
its regime of applicability, that is, infinitely small distances and infinitely large 
energies.

These divergences had their counterpart in the classical “self-energy” of the
electron. Classically, it was known to Lorentz and others near the turn of the 
century that a complete description of the electron’s self-energy was necessarily 
plagued with infinities. An accelerating electron, for example, would produce a 
radiation field that would act back on itself, creating absurd physical effects such 
as the breakdown of causality. Also, other paradoxes abounded; for example it 
would take an infinite amount of energy to assemble an electron. 

Over the decades, many of the world’s finest physicists literally brushed these 
divergent quantities under the rug by manipulating infinite quantities as if they were 
small. This clever sleight-of-hand was called renormalization theory, because
these divergent integrals were absorbed into an infinite rescaling of the coupling 
constants and masses of the theory. Finally, in 1949, Tomonaga, Schwinger, and 
Feynman²⁻⁴ penetrated this thicket of infinities and demonstrated how to extract
meaningful physical information from QED, for which they received the Nobel 
Prize in 1965. 

Ironically, Dirac hated the solution to this problem. To him, the techniques 
of renormalization seemed so abstruse, so artificial, that he could never reconcile 
himself with renormalization theory. To the very end, he insisted that one must 
propose newer, more radical theories that required no renormalization whatsoever. 

Nevertheless, the experimental success of renormalization theory could not be 
denied. Its predictions for the anomalous magnetic moment of the electron, the 
Lamb shift, etc. have been tested experimentally to one part in $10^8$, which is a
remarkable degree of confirmation for the theory. 

Although renormalized QED enjoyed great success in the 1950s, attempts at 
generalizing quantum field theory to describe the other forces of nature met with 
disappointment. Without major modifications, quantum field theory appeared to 
be incapable of describing all four fundamental forces of nature.⁵ These forces
are: 


1. The electromagnetic force, which was successfully described by QED.
2. The strong force, which held the nucleus together.
3. The weak force, which governed the properties of certain decaying particles,
   such as the beta decay of the neutron.
4. The gravitational force, which was described classically by Einstein’s general
   theory of relativity.


In the 1950s, it became clear that quantum field theory could only give us a 
description of one of the four forces. It failed to describe the other interactions 
for very fundamental reasons.

Historically, most of the problems facing the quantum description of these
forces can be summarized rather succinctly by the following: 


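$$
\begin{aligned}
\alpha_{\rm em} &\sim 1/137.0359895(61) \\
\alpha_{\rm strong} &\sim 14 \\
G_{\rm weak} &\sim 1.02 \times 10^{-5}/M_p^2 \\
 &\sim 1.16639(2) \times 10^{-5}\ {\rm GeV}^{-2} \\
G_{\rm Newton} &\sim 5.9 \times 10^{-39}/M_p^2 \\
 &\sim 6.67259(85) \times 10^{-11}\ {\rm m}^3\,{\rm kg}^{-1}\,{\rm s}^{-2}
\end{aligned}
\tag{1.1}
$$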


where $M_p$ is the mass of the proton and the parentheses represent the uncertainties.
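As a rough cross-check of the two forms quoted for the weak and gravitational couplings, the dimensionful constants can be converted into pure numbers at the proton mass scale. The sketch below is only illustrative: the values of $\hbar$, $c$, and the proton mass are rounded reference values assumed here (they are not quoted in the text), and the function names are hypothetical.

```python
# Dimensionless strengths of the weak and gravitational couplings,
# evaluated at the proton mass scale. All constants other than G_NEWTON
# and G_FERMI are assumed reference values, not taken from the text.

HBAR = 1.05457266e-34        # J s (assumed)
C = 2.99792458e8             # m / s (assumed)
G_NEWTON = 6.67259e-11       # m^3 kg^-1 s^-2, as in Eq. (1.1)
G_FERMI = 1.16639e-5         # GeV^-2, as in Eq. (1.1)
M_PROTON_KG = 1.6726231e-27  # kg (assumed)
M_PROTON_GEV = 0.93827231    # GeV in natural units (assumed)


def gravity_strength():
    """Return G_Newton * M_p^2 / (hbar * c), a pure number."""
    return G_NEWTON * M_PROTON_KG**2 / (HBAR * C)


def weak_strength():
    """Return G_Fermi * M_p^2 in natural units, a pure number."""
    return G_FERMI * M_PROTON_GEV**2


if __name__ == "__main__":
    print(f"gravity: {gravity_strength():.2e}")
    print(f"weak:    {weak_strength():.2e}")
```

With these inputs the gravitational number comes out near $5.9 \times 10^{-39}$ and the weak one near $1.0 \times 10^{-5}$, in line with the chart.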

Several crucial features of the various forces can be seen immediately from this 
chart. The fact that the coupling constant for QED, the “fine structure constant,” 
is approximately 1/137 meant that physicists could successfully power expand 
the theory in powers of $\alpha_{\rm em}$. The power expansion in the fine structure constant,
called “perturbation theory,” remains the predominant tool in quantum field theory. 
The smallness of the coupling constant in QED gave physicists confidence that 
perturbation theory was a reliable approximation to the theory. However, this 
fortuitous circumstance did not persist for the other interactions. 
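To make the contrast concrete, one can compare how successive powers of a coupling behave for the two values quoted in Eq. (1.1). The sketch below sets every expansion coefficient to 1, which is purely an illustrative assumption (real coefficients are not unity), and also evaluates the well-known lowest-order QED correction to the electron magnetic moment, $\alpha/2\pi$, which is not derived in this section. Function names are hypothetical.

```python
import math

ALPHA_EM = 1 / 137.0359895   # fine structure constant, Eq. (1.1)
ALPHA_STRONG = 14.0          # strong coupling quoted in Eq. (1.1)


def term_sizes(coupling, orders):
    """Return coupling**n for n = 1..orders.

    Coefficients are set to 1 for illustration only.
    """
    return [coupling**n for n in range(1, orders + 1)]


def lowest_order_moment_correction(alpha=ALPHA_EM):
    """Lowest-order anomalous magnetic moment of the electron, alpha / (2 pi)."""
    return alpha / (2 * math.pi)


if __name__ == "__main__":
    print("QED terms:   ", term_sizes(ALPHA_EM, 3))
    print("strong terms:", term_sizes(ALPHA_STRONG, 3))
    print("alpha/(2 pi):", lowest_order_moment_correction())
```

For the small coupling each order is suppressed by roughly two orders of magnitude relative to the previous one, while for a coupling of 14 each order is larger than the last, so truncating the series is not a controlled approximation.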


1.2 Strong Interactions 


In contrast to QED, the strongly interacting particles, the “hadrons” (from the 
Greek word hadros, meaning “strong”), have a large coupling constant, meaning 
that perturbation theory was relatively useless in predicting the spectrum of the 
strongly interacting particles. Unfortunately, nonperturbative methods were noto- 
riously crude and unreliable. As a consequence, progress in the strong interactions 
was painfully slow. 

In the 1940s, the first seminal breakthrough in the strong interactions was the 
realization that the force binding the nucleus together could be mediated by the 
exchange of π mesons:


$$
\begin{aligned}
\pi^- + p &\leftrightarrow n \\
\pi^+ + n &\leftrightarrow p
\end{aligned}
\tag{1.2}
$$


Theoretical predictions by Yukawa⁶ of the mass and range of the π meson, based
on the energy scale of the strong interactions, led experimentalists to find the π
meson in their cosmic ray experiments. The π meson was therefore deduced to
be the carrier of the nuclear force that bound the nucleus together. 
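The link between the range of a force and the mass of its carrier can be sketched with the standard estimate $m c^2 \approx \hbar c / r$. In the snippet below the range of 1.4 fm is an illustrative assumption for the reach of the nuclear force, not a figure from the text, and the function name is hypothetical.

```python
HBAR_C_MEV_FM = 197.327  # hbar * c in MeV fm (assumed reference value)


def mediator_mass_mev(range_fm):
    """Estimate the rest energy (MeV) of a force carrier from its range (fm).

    Uses the order-of-magnitude relation m c^2 ~ hbar c / r.
    """
    return HBAR_C_MEV_FM / range_fm


if __name__ == "__main__":
    # Assumed nuclear-force range of about 1.4 fm.
    print(f"{mediator_mass_mev(1.4):.0f} MeV")
```

A range of this size points to a carrier of roughly 140 MeV, far heavier than the electron yet much lighter than the proton, which is the regime in which the π meson was found.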

This breakthrough, however, was tempered with the fact that, as we noted, the 
pion-nucleon coupling constant was much greater than one. Although the Yukawa
meson theory as a quantum field theory was known to be renormalizable, pertur- 
bation theory was unreliable when applied to the Yukawa theory. Nonperturbative 
effects, which were exceedingly difficult to calculate, became dominant.

Furthermore, the experimental situation became confusing when so many 
“resonances” began to be discovered in particle accelerator experiments. This 
indicated again that the coupling constant of some unknown underlying theory 
was large, beyond the reach of conventional perturbation theory. Not surpris- 
ingly, progress in the strong interactions was slow for many decades for these 
reasons. With each newly discovered resonance, physicists were reminded of the 
inadequacy of quantum field theory. 

Given the failure of conventional quantum field theory, a number of alterna- 
tive approaches were investigated in the 1950s and 1960s. Instead of focusing 
on the “field” of some unknown constituent as the fundamental object (which 
is in principle unmeasurable), these new approaches centered on the S matrix
itself. Borrowing from classical optics, Goldberger and his colleagues⁷ assumed
the S matrix was an analytic function that satisfied certain dispersion relations.
Alternatively, Chew⁸ assumed a type of “nuclear democracy”; that is, there were
no fundamental particles at all. In this approach, one hoped to calculate the S 
matrix directly, without using field theory, because of the many stringent physical 
conditions that it satisfied. 

The most successful approach, however, was the SU(3) “quark” theory of the
strongly interacting particles (the hadrons). Gell-Mann, Ne’eman, and Zweig,⁹⁻¹¹
building on earlier work of Sakata and his collaborators,¹²,¹³ tried to explain the
hadron spectrum with the symmetry group SU(3).

Since quantum field theory was unreliable, physicists focused on the quark 
model as a strictly phenomenological tool to make sense out of the hundreds of 
known resonances. Composite combinations of the “up,” “down,” and “strange” 
quarks could, in fact, explain all the hadrons discovered up to that time. Together, 
these three quarks formed a representation of the Lie group SU(3):


$$
q_i = \begin{pmatrix} u \\ d \\ s \end{pmatrix}
\tag{1.3}
$$


The quark model could predict with relative ease the masses and properties of 
particles that were not yet discovered. A simple picture of the strong interactions 
was beginning to emerge: Three quarks were necessary to construct a baryon, 
such as a proton or neutron (or the higher resonances, such as the Λ, Σ, Ω, etc.),
while a quark and an antiquark were necessary to assemble a meson, such as the
π meson or the K meson:


$$
\text{Hadrons} =
\begin{cases}
\text{Baryons} = q_i q_j q_k \\
\text{Mesons} = q_i \bar{q}_j
\end{cases}
\tag{1.4}
$$


Ironically, one problem of the quark model was that it was too successful. The 
theory was able to make qualitative (and often quantitative) predictions far beyond 
the range of its applicability. Yet the fractionally charged quarks themselves 
were never discovered in any scattering experiment. Perhaps they were just a 
mathematical artifice, reflecting a deeper physical reality that was still unknown. 
Furthermore, since there was no quantum field theory of quarks, it was unknown 
what dynamical force held them together. As a consequence, the model was 
unable to explain why certain bound states of quarks (called “exotics”) were not 
found experimentally. 
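The counting rule of three quarks per baryon and a quark-antiquark pair per meson can be checked against the electric charges of familiar hadrons. The sketch below uses the standard fractional charges of the up, down, and strange quarks; the `~` suffix for an antiquark and the function name are conventions invented here for illustration.

```python
from fractions import Fraction

# Standard electric charges of the three light quarks, in units of e.
QUARK_CHARGE = {
    "u": Fraction(2, 3),
    "d": Fraction(-1, 3),
    "s": Fraction(-1, 3),
}


def hadron_charge(content):
    """Sum the charges of a list of quark labels.

    A trailing '~' (hypothetical notation) marks an antiquark,
    whose charge is the negative of the quark's.
    """
    total = Fraction(0)
    for label in content:
        if label.endswith("~"):
            total -= QUARK_CHARGE[label[0]]
        else:
            total += QUARK_CHARGE[label]
    return total


if __name__ == "__main__":
    print("proton  (uud):", hadron_charge(["u", "u", "d"]))
    print("neutron (udd):", hadron_charge(["u", "d", "d"]))
    print("pi+     (u d~):", hadron_charge(["u", "d~"]))
    print("K+      (u s~):", hadron_charge(["u", "s~"]))
```

Every allowed combination carries integer charge even though the constituents do not, which is consistent with fractionally charged quarks never appearing in isolation.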


1.3 Weak Interactions


Equation (1.1), which describes the coupling constants of the four fundamental 
forces, also reveals why quantum field theory failed to describe the weak inter- 
actions. The coupling constant for the weak interactions has the dimensions of 
inverse mass squared. Later, we will show that theories of this type are nonrenor- 
malizable; that is, theories with coupling constants of negative dimension predict 
infinite amplitudes for particle scattering. 

Historically, the weak interactions were first experimentally observed when 
strongly interacting particles decayed into lighter particles via a much weaker 
force, such as the decay of the neutron into a proton, electron, and antineutrino: 


$$
n \to p + e^- + \bar{\nu}
\tag{1.5}
$$


These light particles, such as the electron, its neutrino, the muon μ, etc., were
called leptons:


Leptons = $e^{\pm}$, $\nu$, $\mu^{\pm}$, etc. (1.6)


Fermi, back in the 1930s, postulated the form of the action that could give 
a reasonably adequate description of the lowest-order behavior of the weak 
interactions. However, any attempt to calculate quantum corrections to the 
Fermi theory floundered because the higher-order terms diverged. The theory 
was nonrenormalizable because the coupling constant had negative dimension. 
Furthermore, it could be shown that the naive Fermi theory violated unitarity at
sufficiently large energies. 
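A dimensional estimate shows where this breakdown sets in. Because the Fermi coupling has dimensions of inverse mass squared, the effective dimensionless strength of the interaction at energy $E$ grows like $G_{\rm weak} E^2$, and it reaches order one near $E \sim G_{\rm weak}^{-1/2}$. The sketch below is only this order-of-magnitude argument, ignoring numerical factors that a full unitarity analysis would supply; the function names are hypothetical.

```python
import math

G_FERMI = 1.16639e-5  # GeV^-2, as in Eq. (1.1)


def effective_strength(energy_gev, g_fermi=G_FERMI):
    """Dimensionless combination G_F * E^2 at a given energy in GeV."""
    return g_fermi * energy_gev**2


def breakdown_scale_gev(g_fermi=G_FERMI):
    """Energy (GeV) at which G_F * E^2 reaches one (order of magnitude only)."""
    return 1.0 / math.sqrt(g_fermi)


if __name__ == "__main__":
    print(f"G_F E^2 at 1 GeV: {effective_strength(1.0):.2e}")
    print(f"breakdown near:   {breakdown_scale_gev():.0f} GeV")
```

The estimate lands at a few hundred GeV, so the Fermi theory works well at the low energies of beta decay but cannot be trusted far above that scale.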

The mystery of the weak interactions deepened in 1956, when Lee and Yang¹⁴
theorized that parity conservation, long thought to be one of the fundamental 
principles of physics, was violated in the weak interactions. Their conjecture 
was soon proved to be correct by the careful experimental work of Wu and also 
Lederman and Garwin and their colleagues.¹⁵,¹⁶

Furthermore, more and more weakly interacting leptons were discovered over 
the next few decades. The simple picture of the electron, neutrino, and muon was 
shattered as the muon neutrino and the tau lepton were found experimentally. Thus, 
there was the unexplained embarrassment of three exact copies or “generations” 
of leptons, each generation acting like a Xerox copy of the previous one. (The 
solution of this problem is still unknown.) 

There were some modest proposals that went beyond the Fermi action, which 
postulated the existence of a massive vector meson or W boson that mediated 
the weak forces. Buoyed by the success of the Yukawa meson theory, physicists 
postulated that a massive spin-one vector meson might be the carrier of the weak 
force. However, the massive vector meson theory, although it was on the right 
track, had problems because it was also nonrenormalizable. As a result, the mas- 
sive vector meson theory was considered to be one of several phenomenological 
possibilities, not a fundamental theory. 


1.4 Gravitational Interaction 


Ironically, although the gravitational interaction was the first of the four forces to 
be investigated classically, it was the most difficult one to be quantized. 

Using some general physical arguments, one could calculate the mass and spin 
of the gravitational interaction. Since gravity was a long-range force, it should 
be massless. Since gravity was always attractive, this meant that its spin must be 
even. (Spin-one theories, such as electromagnetism, can be both attractive and 
repulsive.) Since a spin-zero theory was not compatible with the known bending of
starlight around the sun, we were left with a spin-two theory. A spin-two theory 
could also be coupled equally to all matter fields, which was consistent with the 
equivalence principle. These heuristic arguments indicated that Einstein’s theory 
of general relativity should be the classical approximation to a quantum theory of 
gravity. 

The problem, however, was that quantum gravity, as seen from Eq. (1.1), 
had a dimensionful coupling constant and hence was nonrenormalizable. This 
coupling constant, in fact, was Newton’s gravitational constant, the first important 
universal physical constant to be isolated in physics. Ironically, the very success 
of Newton’s early theory of gravitation, based on the constancy of Newton’s
constant, proved to be fatal for a quantum theory of gravity. 

Another fundamental problem with quantum gravity was that, according to 
Eq. (1.1), the strength of the interaction was exceedingly weak, and hence very 
difficult to measure. For example, it takes the entire planet earth to keep pieces 
of paper resting on a tabletop, but it only takes a charged comb to negate gravity 
and pick them up. Similarly, if an electron and proton were bound in a hydrogen 
atom by the gravitational force, the radius of the atom would be roughly the size 
of the known universe. Although gravitational forces were weaker by comparison 
to the electromagnetic force by a factor of about $10^{-40}$, making it exceedingly
difficult to study, one could also show that a quantum theory of gravity had the
reverse problem, that its natural energy scale was $10^{19}$ GeV. Once gravity was
quantized, the energy scale at which the gravitational interaction became dominant
was set by Newton’s constant $G_N$. To see this, let $r$ be the distance at which the
gravitational potential energy of a particle of mass $M$ equals its rest energy, so
that $G_N M^2/r = Mc^2$. Let $r$ also be the Compton wavelength of this particle, so
that $r = \hbar/Mc$. Eliminating $M$ and solving for $r$, we find that $r$ equals the Planck
length, $10^{-33}$ cm, or, expressed as an energy, $10^{19}$ GeV:


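$$
\begin{aligned}
[\hbar G_N c^{-3}]^{1/2} &= 1.61605(10) \times 10^{-33}\ {\rm cm} \\
[\hbar c / G_N]^{1/2} &= 1.221047(79) \times 10^{19}\ {\rm GeV}/c^2
\end{aligned}
\tag{1.7}
$$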


This is, of course, beyond the range of our instruments for the foreseeable future. 
So physicists were faced with the double problem: The classical theory of gravity 
was so weak that macroscopic experiments were difficult to perform in the lab- 
oratory, but the quantum theory of gravity dominated subatomic reactions at the 
incredible energy scale of $10^{19}$ GeV, which was far beyond the range of our largest
particle accelerators. 
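Both halves of this double problem can be checked numerically. Substituting $M = \hbar/rc$ into $G_N M^2/r = Mc^2$ gives $r^2 = \hbar G_N/c^3$, and the matching mass is $M = (\hbar c/G_N)^{1/2}$. The sketch below evaluates these together with the ratio of the gravitational to the Coulomb force between an electron and a proton. Apart from Newton's constant, all constants are rounded reference values assumed here, and the function names are hypothetical.

```python
import math

G_NEWTON = 6.67259e-11         # m^3 kg^-1 s^-2, as in Eq. (1.1)
HBAR = 1.05457266e-34          # J s (assumed)
C = 2.99792458e8               # m / s (assumed)
E_CHARGE = 1.60217733e-19      # C (assumed)
EPSILON_0 = 8.854187817e-12    # F / m (assumed)
M_ELECTRON = 9.1093897e-31     # kg (assumed)
M_PROTON = 1.6726231e-27       # kg (assumed)
JOULE_PER_GEV = 1.60217733e-10


def planck_length_cm():
    """sqrt(hbar * G / c^3), converted from metres to centimetres."""
    return math.sqrt(HBAR * G_NEWTON / C**3) * 100.0


def planck_mass_gev():
    """sqrt(hbar * c / G), expressed as a rest energy in GeV."""
    mass_kg = math.sqrt(HBAR * C / G_NEWTON)
    return mass_kg * C**2 / JOULE_PER_GEV


def gravity_to_coulomb_ratio():
    """Ratio of gravitational to electrostatic attraction for an
    electron-proton pair; the separation cancels out."""
    gravity = G_NEWTON * M_ELECTRON * M_PROTON
    coulomb = E_CHARGE**2 / (4.0 * math.pi * EPSILON_0)
    return gravity / coulomb


if __name__ == "__main__":
    print(f"Planck length: {planck_length_cm():.3e} cm")
    print(f"Planck mass:   {planck_mass_gev():.3e} GeV")
    print(f"gravity / EM:  {gravity_to_coulomb_ratio():.1e}")
```

The force ratio comes out at a few times $10^{-40}$, while the Planck scale sits near $10^{-33}$ cm and $10^{19}$ GeV, matching Eq. (1.7).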

Yet another problem arose when one tried to push the theory of gravity to 
its limits. Phenomenologically, Einstein’s general relativity has proved to be 
an exceptionally reliable tool over cosmological distances. However, when one 
investigated the singularity at the center of a black hole or the instant of the Big 
Bang, then the gravitational fields became singular, and the theory broke down. 
One expected quantum corrections to dominate in those important regions of 
space-time. However, without a quantum theory of gravity, it was impossible to 
make any theoretical calculation in those interesting regions of space and time. 

In summary, an enormous amount of information is summarized in Eq. (1.1). 
Some of the fundamental reasons why the development of quantum field theory 
was stalled in the 1950s are summarized in this chart.


1.5 Gauge Revolution 


In the 1950s and 1960s, there was a large mass of experimental data for the 
strong and weak interactions that was patiently accumulated by many experimental 
groups. However, most of it could not be explained theoretically. There were 
significant strides taken experimentally, but progress in theory was, by contrast, 
painfully slow. 

In 1971, however, a dramatic discovery was made by G. ’t Hooft,¹⁷ then a
graduate student. He reinvestigated an old theory of Yang and Mills, which was 
a generalization of the Maxwell theory of light, except that the symmetry group 
was much larger. Building on earlier pioneering work by Veltman, Faddeev, 
Higgs, and others, ’t Hooft showed that Yang-Mills gauge theory, even when
its symmetry group was “spontaneously broken,” was renormalizable. With this 
important breakthrough, it now became possible to write down renormalizable 
theories of the weak interactions, where the W bosons were represented as gauge 
fields. 

Within a matter of months, a flood of important papers came pouring out. An 
earlier theory of Weinberg and Salam¹⁸,¹⁹ of the weak interactions, which was a
gauge theory based on the symmetry group SU(2) ⊗ U(1), was resurrected and
given serious analysis. The essential point, however, was that because gauge 
theories were now known to be renormalizable, concrete numerical predictions 
could be made from various gauge theories and then checked to see if they 
reproduced the experimental data. If the predictions of gauge theory disagreed 
with the experimental data, then one would have to abandon them, no matter how 
elegant or aesthetically satisfying they were. Gauge theorists realized that the 
ultimate judge of any theory was experiment. 

Within several years, the agreement between experiment and the Weinberg— 
Salam theory proved to be overwhelming. The data were sufficiently accurate to 
rule out several competing models and verify the correctness of the Weinberg— 
Salam model. The weak interactions went from a state of theoretical confusion to 
one of relative clarity within a brief period of time. The experimental discovery 
of the gauge bosons $W^\pm$ and Z in 1983 predicted by Weinberg and Salam was 
another important vindication of the theory. 

The Weinberg—Salam model arranged the leptons in a simple manner. It 
postulated that the (left-handed) leptons could be arranged according to SU(2) 
doublets in three separate generations: 


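$$\begin{pmatrix} \nu_e \\ e \end{pmatrix}, \quad \begin{pmatrix} \nu_\mu \\ \mu \end{pmatrix}, \quad \begin{pmatrix} \nu_\tau \\ \tau \end{pmatrix} \tag{1.8}$$

The interactions between these leptons were generated by the intermediate 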
vector bosons: 

$$\text{Vector mesons:} \quad W^\pm_\mu, \ Z_\mu \tag{1.9}$$

(The remaining problem with the model is to find the Higgs bosons experimen- 
tally, which are responsible for spontaneous symmetry breaking, or to determine 
if they are composite particles.) 

In the realm of the strong interactions, progress was also fairly rapid. The 
gauge revolution made possible Quantum Chromodynamics (QCD), which quickly 
became the leading candidate for a theory of the strong interactions. By postulat- 
ing a new “color” SU(3) symmetry, the Yang—Mills theory now provided a glue 
by which the quarks could be held together. [The SU(3) color symmetry should 
not be confused with the earlier SU(3) symmetry of Gell-Mann, Ne’eman, and 
Zweig, which is now called the “flavor” symmetry. Quarks thus have two indices 
on them; one index a = u, d, s, c, t, b labels the flavor symmetry, while the other 
index labels the color symmetry.] 

The quarks in QCD are represented by: 


$$\begin{pmatrix} u^1 & u^2 & u^3 \\ d^1 & d^2 & d^3 \\ s^1 & s^2 & s^3 \\ c^1 & c^2 & c^3 \end{pmatrix} \tag{1.10}$$

where the 1, 2, 3 index labels the color symmetry. QCD gave a plausible explana- 
tion for the mysterious experimental absence of the quarks. One could calculate 
that the effective SU(3) color coupling constant became large at low energy, and 
hence “confined” the quarks permanently into the known hadrons. If this picture 
was correct, then the gluons condensed into a taffy-like substance that bound the 
quarks together, creating a string-like object with quarks at either end. If one tried 
to pull the quarks apart, the condensed gluons would resist their being separated. 
If one pulled hard enough, then the string might break and another bound quark— 
antiquark pair would be formed, so that a single quark cannot be isolated (similar 
to the way that a magnet, when broken, simply forms two smaller magnets, and 
not single monopoles). 

The flip side of this was that one could also prove that the SU(3) color 
coupling constant became small at large energies. This was called “asymptotic 
freedom,” which was discovered in gauge theories by Gross, Wilczek, Politzer, 
and ’t Hooft.*°-”* At high energies, it could explain the curious fact that the quarks 
acted as if they were described by a free theory. This was because the effective 
coupling constants decreased in size with rising energy, giving the appearance of 
a free theory. This explained the fact that the quark model worked much better 
than it was supposed to. 

Gradually, a small industry developed around finding nonperturbative solu- 
tions to QCD that could explain the confinement of quarks. On one hand, physicists 
showed that two-dimensional “toy models” could reproduce many of the features 
that were required of a quantum field theory of quarks, such as confinement and 
asymptotic freedom. Many of these features followed from the exact solution of 
these toy models. On the other hand, a compelling description of the four dimen- 
sional theory could be achieved through Wilson’s lattice gauge theory,? which 
gave qualitative nonperturbative information concerning QCD. In the absence of 
analytic solutions to QCD, lattice gauge theories today provide the most promising 
approach to the still-unsolved problem of quark confinement. 

Soon, both the electroweak and QCD models were spliced together to become 
the Standard Model based on the gauge group SU(3) ⊗ SU(2) ⊗ U(1). 
The Standard Model was more than just the sum of its parts. The leptons in 
the Weinberg—Salam model were shown to possess “anomalies” that threatened 
renormalizability. Fortunately, these potentially fatal anomalies precisely can- 
celled against the anomalies coming from the quarks. In other words, the lepton 
and quark sectors of the Standard Model cured each other’s diseases, which was 
a gratifying theoretical success for the Standard Model. As a result of this and 
other theoretical and experimental successes, the Standard Model was rapidly 
recognized to be a first-order approximation to the ultimate theory of particle 
interactions. 

The spectrum of the Standard Model for the left-handed fermions is schematically 
listed here, consisting of the neutrino ν, the electron e, the “up” and “down” 
quarks, which come in three “colors,” labeled by the index i. This pattern is then 
repeated for the other two generations (although the top quark has not yet been 
discovered): 


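$$\begin{pmatrix} \nu_e \\ e \end{pmatrix} \begin{pmatrix} u^i \\ d^i \end{pmatrix}; \quad \begin{pmatrix} \nu_\mu \\ \mu \end{pmatrix} \begin{pmatrix} c^i \\ s^i \end{pmatrix}; \quad \begin{pmatrix} \nu_\tau \\ \tau \end{pmatrix} \begin{pmatrix} t^i \\ b^i \end{pmatrix} \tag{1.11}$$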


In the Standard Model, the forces between the leptons and quarks were me- 
diated by the massive vector mesons for the weak interactions and the massless 
gluons for the strong interactions: 


$$\begin{aligned} \text{Massive vector mesons:} &\quad W^\pm, \ Z \\ \text{Massless gluons:} &\quad A^a_\mu \end{aligned} \tag{1.12}$$


The weaknesses of the Standard Model, however, were also readily apparent. 
No one saw the theory as a fundamental theory of matter and energy. Containing at 
least 19 arbitrary parameters in the theory, it was a far cry from the original dream 
of physicists: a single unified theory with at most one undetermined coupling 
constant. 


1.6 Unification 


In contrast to the 1950s, when physicists were flooded with experimental data 
without a theoretical framework to understand them, the situation in the 1990s may 
be the reverse; that is, the experimental data are all consistent with the Standard 
Model. As a consequence, without important clues coming from experiment, 
physicists have proposed theories beyond the Standard Model that cannot be 
tested with the current level of technology. In fact, even the next generation 
of particle accelerators may not be powerful enough to rule out many of the 
theoretical models currently being studied. In other words, while experiment led 
theory in the 1950s, in the 1990s theory may lead experiment. 

At present, attempts to use quantum field theory to push beyond the Standard 
Model have met with modest successes. Unfortunately, the experimental data at 
very large energies are still absent, and this situation may persist into the near 
future. However, enormous theoretical strides have been made that give us some 
confidence that we are on the right track. 

The next plausible step beyond the Standard Model may be the GUTs (Grand 
Unified Theories), which are based on gauging a single Lie group, such as SU(5) 
or SO(10) (Fig. 1.1). 


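[Figure 1.1: chart of the unification of forces. Electricity and magnetism combine; together with the weak force they form SU(2) ⊗ U(1); together with the strong force, SU(5) or O(10) (?); together with gravity, superstrings (?).] 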


Figure 1.1. This chart shows how the various forces of nature, once thought to be fun- 
damentally distinct, have been unified over the past century, giving us the possibility of 
unifying all known forces via quantum field theory. 

According to GUT theory, the energy scale at which the unification of all 
three particle forces takes place is enormously large, about $10^{15}$ GeV, just below 
the Planck energy. Near the instant of the Big Bang, where such energies were 
found, the theory predicts that all three particle forces were unified by one GUT 
symmetry. In this picture, as the universe rapidly cooled down, the original GUT 
symmetry was broken down successively into the present-day symmetries of the 
Standard Model. A typical breakdown scheme might be: 


$$O(10) \rightarrow SU(5) \rightarrow SU(3) \otimes SU(2) \otimes U(1) \rightarrow SU(3) \otimes U(1) \tag{1.13}$$


Because quarks and electrons are now placed in the same multiplet, it also 
means that quarks can decay into leptons, giving us the violation of baryon number 
and the decay of the proton. So far, elaborate experimental searches for proton 
decay have been disappointing. The experimental data are nevertheless good enough 
to rule out minimal SU(5); there are, however, more complicated GUT theories 
that can accommodate longer proton lifetimes. 

Although GUT theories are a vast improvement over the Standard Model, they 
are also beset with problems. First, they cannot explain the three generations of 
particles that have been discovered. Instead, we must have three identical GUT 
theories, one for each generation. Second, they still have many arbitrary parameters 
(e.g., Higgs parameters) that cannot be explained from any simpler principle. 
Third, the unification scale lies at energies near the Planck scale, where 
we expect gravitational effects to become large, yet gravity is not included in the 
theory. Fourth, they postulate a barren “desert” that extends for twelve 
orders of magnitude, from the GUT scale down to the electroweak scale. (The 
critics claim that there is no precedent in physics for an extrapolation over 
such a large range of energy.) Fifth, there is the “hierarchy problem,” meaning that 
radiative corrections will mix the two energy scales together, ruining the entire 
program. 

To solve the last problem, physicists have studied quantum field theories that 
can incorporate “supersymmetry,” a new symmetry that puts fermions and bosons 
into the same multiplet. Although not a single shred of experimental data supports 
the existence of supersymmetry, it has opened the door to an entirely new kind of 
quantum field theory that has remarkable renormalization properties and is still 
fully compatible with its basic principles. 

In a supersymmetric theory, once we set the energy scale of unification and 
the energy scale of the low-energy interactions, we never have to “retune” the 
parameters again. Renormalization effects do not mix the two energy scales. 
However, one of the most intriguing properties of supersymmetry is that, once it 
is made into a local symmetry, it necessarily incorporates quantum gravity into the 
spectrum. Quantum gravity, instead of being an unpleasant, undesirable feature 
of quantum field theory, is necessarily an integral part of any theory with local 
supersymmetry. 

Historically, it was once thought that all successful quantum field theories 
required renormalization. However, today supersymmetry gives us theories, like 
the SO(4) super Yang-Mills theory and the superstring theory, which are finite to 
all orders in perturbation theory, a truly remarkable achievement. For the first 
time since its inception, it is now possible to write down quantum field theories 
that require no renormalization whatsoever. This answers, in some sense, Dirac’s 
original criticism of his own creation. 

Only time will tell if GUTs, supersymmetry, and the superstring theory can 
give us a faithful description of the universe. In this book, our attitude is that they 
are an exciting theoretical laboratory in which to probe the limits of quantum field 
theory. And the fact that supersymmetric theories can improve and in some cases 
solve the problem of ultraviolet divergences without renormalization is, by itself, 
a feature of quantum field theory worthy of study. 

Let us, therefore, now leave the historical setting of quantum field theory and 
begin a discussion of how quantum field theory gives us a quantum description of 
point particle systems with an infinite number of degrees of freedom. Although 
the student may already be familiar with the foundations of classical mechanics 
and the transition to the quantum theory, it will prove beneficial to review this 
material from a slightly different point of view, that is, systems with an infinite 
number of degrees of freedom. This will then set the stage for quantum field 
theory. 


1.7 Action Principle 


Before we begin our discussion of field theory, for notational purposes it is cus- 
tomary to choose units so that: 


$$c = \hbar = 1 \tag{1.14}$$


(We can always do this because the definition of c and $\hbar = h/2\pi$ depends on certain 
conventions that grew historically in our understanding of nature. Imposing c = 1, 
for example, means that seconds and centimeters are to be treated on the same 
footing, such that exactly 299,792,458 meters is equivalent to 1 sec. Thus, the 
second and the centimeter are to be treated as if they were expressions of the same 
unit. Likewise, setting $\hbar = 1$ means that the erg × sec is now dimensionless, so 
the erg and the second are inverses of each other. This also means that the gram 
is inversely related to the centimeter. With these conventions, the only unit that 
survives is the centimeter, or equivalently, the gram.) 
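
The unit identifications described above can be made concrete with a short numerical sketch. The function names below are hypothetical, and the values of c and $\hbar$ in cgs units are inserted by hand (they are not quoted in the text); the sketch only illustrates how a time becomes a length and a mass becomes an inverse length once c and $\hbar$ are set to 1.

```python
# Illustrative constants in cgs units (assumed values, not taken from the text).
C_CM_PER_S = 2.99792458e10      # speed of light in cm/s
HBAR_ERG_S = 1.054571817e-27    # reduced Planck constant in erg*s

def seconds_to_cm(t_seconds):
    """With c = 1, a time interval is equivalent to the length c * t."""
    return t_seconds * C_CM_PER_S

def ergs_to_inverse_seconds(energy_erg):
    """With hbar = 1, an energy is equivalent to the inverse time E / hbar."""
    return energy_erg / HBAR_ERG_S

def grams_to_inverse_cm(mass_g):
    """With c = hbar = 1, a mass is equivalent to the inverse length m c / hbar."""
    return mass_g * C_CM_PER_S / HBAR_ERG_S

# Example: the electron mass corresponds to the inverse of its
# reduced Compton wavelength.
electron_mass_g = 9.1093837015e-28
print(1.0 / grams_to_inverse_cm(electron_mass_g), "cm")
```

The design choice here is simply to keep every conversion a single multiplication by a power of c and $\hbar$, which is all that "setting them to 1" hides.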

In classical mechanics, there are two equivalent formulations, one based on 
Newton’s equations of motion, and the other based on the action principle. At 
first, these two formalisms seem to have little in common. The first depends on 
iterating infinitesimal changes sequentially along a particle’s path. The second 
depends on evaluating all possible paths between two points and selecting out 
the one with the minimum action. One of the great achievements of classical 
mechanics was the demonstration that the Newtonian equations of motion were 
equivalent to minimizing the action over the set of all paths: 


$$\text{Equation of motion} \leftrightarrow \text{Action principle} \tag{1.15}$$


However, when we generalize our results to the quantum realm, this equiva- 
lence breaks down. The Heisenberg Uncertainty Principle forces us to introduce 
probabilities and consider all possible paths taken by the particle, with the classical 
path simply being the most likely. Quantum mechanically, we know that there 
is a finite probability for a particle to deviate from its classical equation of mo- 
tion. This deviation from the classical path is very small, on the scale determined 
by Planck’s constant. However, on the subatomic scale, this deviation becomes 
the dominant aspect of a particle’s motion. In the microcosm, motions that are 
in fact forbidden classically determine the primary characteristics of the atom. 
The stability of the atom, the emission and absorption spectrum, radioactive de- 
cay, tunneling, etc. are all manifestations of quantum behavior that deviate from 
Newton’s classical equations of motion. 

However, even though Newtonian mechanics fails within the subatomic realm, 
it is possible to generalize the action principle to incorporate these quantum 
probabilities. The action principle then becomes the only framework to calculate 
the probability that a particle will deviate from its classical path. In other words, 
the action principle is elevated into one of the foundations of the new mechanics. 

To see how this takes place, let us first begin by describing the simplest of all 
possible classical systems, the nonrelativistic point particle. In three dimensions, 
we say that this particle has three degrees of freedom, each labeled by coordinates 
$q^i(t)$. The motion of the particle is determined by the Lagrangian $L(q^i, \dot{q}^i)$, which 
is a function of both the position and the velocity of the particle: 


$$L = \frac{1}{2} m (\dot{q}^i)^2 - V(q) \tag{1.16}$$


where V(q) is some potential in which the particle moves. Classically, the motion 
of the particle is determined by minimizing the action, which is the integral of the 
Lagrangian: 


$$S = \int_{t_1}^{t_2} L(q^i, \dot{q}^i) \, dt \tag{1.17}$$


We can derive the classical equations of motion by minimizing the action; that 
is, the classical Newtonian path taken by the particle is the one in which the action 
is a minimum: 


$$\delta S = 0 \tag{1.18}$$


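To calculate the classical equations of motion, we make a small variation in the 
path of the particle $\delta q^i(t)$, keeping the endpoints fixed; that is, $\delta q^i(t_1) = \delta q^i(t_2) = 0$. 
To calculate $\delta S$, we must vary the Lagrangian with respect to both changes in 
the position and the velocity: 


$$\delta S = \int_{t_1}^{t_2} dt \left( \frac{\delta L}{\delta q^i} \delta q^i + \frac{\delta L}{\delta \dot{q}^i} \delta \dot{q}^i \right) = 0 \tag{1.19}$$

We can integrate the last expression by parts: 


$$\delta S = \int_{t_1}^{t_2} dt \left[ \delta q^i \left( \frac{\delta L}{\delta q^i} - \frac{d}{dt} \frac{\delta L}{\delta \dot{q}^i} \right) + \frac{d}{dt} \left( \delta q^i \frac{\delta L}{\delta \dot{q}^i} \right) \right] = 0 \tag{1.20}$$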


Since the variation vanishes at the endpoints of the integration, we can drop the 
last term. Then the action is minimized if we demand that: 


$$\frac{\delta L}{\delta q^i} - \frac{d}{dt} \frac{\delta L}{\delta \dot{q}^i} = 0 \tag{1.21}$$

which are called the Euler-Lagrange equations. 

If we now insert the value of the Lagrangian into the Euler-Lagrange equations, 
we find the classical equation of motion: 


$$m \frac{d^2 q^i}{dt^2} = -\frac{\partial V(q)}{\partial q^i} \tag{1.22}$$

which forms the basis of Newtonian mechanics. 
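
The statement that the Newtonian path extremizes the action can be checked numerically. The following is a minimal sketch, assuming a harmonic potential $V(q) = \frac{1}{2} k q^2$ with m = k = 1 (a choice made here for illustration, not taken from the text) and a time interval shorter than half a period, so that the stationary path is a true minimum. The function names are hypothetical.

```python
import math

def discretized_action(path, dt, m=1.0, k=1.0):
    """Approximate S = integral of (m/2) v^2 - V(q) dt, with V(q) = (k/2) q^2."""
    total = 0.0
    for q_now, q_next in zip(path, path[1:]):
        velocity = (q_next - q_now) / dt      # finite-difference velocity
        q_mid = 0.5 * (q_now + q_next)        # midpoint position used in V(q)
        total += dt * (0.5 * m * velocity ** 2 - 0.5 * k * q_mid ** 2)
    return total

def classical_path(n_steps, t_final):
    """Sample q(t) = sin(t), which solves m q'' = -k q for m = k = 1."""
    dt = t_final / n_steps
    return [math.sin(j * dt) for j in range(n_steps + 1)], dt

def deform(path, eps):
    """Add a bump eps * sin(pi j / N) that vanishes (up to rounding) at both endpoints."""
    n_steps = len(path) - 1
    return [q + eps * math.sin(math.pi * j / n_steps) for j, q in enumerate(path)]

path, dt = classical_path(n_steps=1000, t_final=1.0)
s_classical = discretized_action(path, dt)
for eps in (-0.01, 0.01):
    # Change in the action caused by deforming the classical path.
    print(eps, discretized_action(deform(path, eps), dt) - s_classical)
```

Deforming the path in either direction changes the action only at second order in the deformation, which is the discrete counterpart of $\delta S = 0$ with fixed endpoints.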

Classically, we also know that there are two different formulations of Newtonian 
mechanics, the Lagrangian formulation, where the position $q^i$ and velocity 
$\dot{q}^i$ of a point particle are the fundamental variables, and the Hamiltonian formulation, 
where we choose the position $q^i$ and the momentum $p^i$ to be the independent 
variables. 

To make the transition from the Lagrangian to the Hamiltonian formulation of 
classical mechanics, we first define the momentum as follows: 


$$p^i = \frac{\delta L}{\delta \dot{q}^i} \tag{1.23}$$


For our choice of the Lagrangian, we find that $p^i = m \dot{q}^i$. The definition of the 
Hamiltonian is then given by: 


$$H(q^i, p^i) = p^i \dot{q}^i - L(q^i, \dot{q}^i) \tag{1.24}$$

With our choice of Lagrangian, we find that the Hamiltonian is given by: 


$$H(q^i, p^i) = \frac{(p^i)^2}{2m} + V(q) \tag{1.25}$$


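In the transition from the Lagrangian to the Hamiltonian system, we have 
exchanged independent variables from $q^i, \dot{q}^i$ to $q^i, p^i$. To prove that $H(q^i, p^i)$ is 
not a function of $\dot{q}^i$, we can make the following variation: 


$$\begin{aligned} \delta H &= p^i \delta \dot{q}^i + \delta p^i \dot{q}^i - \frac{\delta L}{\delta q^i} \delta q^i - \frac{\delta L}{\delta \dot{q}^i} \delta \dot{q}^i \\ &= \dot{q}^i \delta p^i - \frac{\delta L}{\delta q^i} \delta q^i \\ &= \frac{\partial H}{\partial p^i} \delta p^i + \frac{\partial H}{\partial q^i} \delta q^i \end{aligned} \tag{1.26}$$


where we have used the definition of $p^i$ to eliminate the dependence of the 
Hamiltonian on $\dot{q}^i$ and have used the chain rule for the last step. By equating the 
coefficients of the variations, we can make the following identification: 


$$\dot{q}^i = \frac{\partial H}{\partial p^i}; \qquad -\dot{p}^i = \frac{\partial H}{\partial q^i} \tag{1.27}$$


where we have used the equations of motion. 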
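In the Hamiltonian formalism, we can also calculate the time variation of any 
field F in terms of the Hamiltonian: 


$$\dot{F} = \frac{\partial F}{\partial t} + \frac{\partial F}{\partial q^i} \dot{q}^i + \frac{\partial F}{\partial p^i} \dot{p}^i = \frac{\partial F}{\partial t} + \frac{\partial H}{\partial p^i} \frac{\partial F}{\partial q^i} - \frac{\partial H}{\partial q^i} \frac{\partial F}{\partial p^i} \tag{1.28}$$

If we define the Poisson bracket as: 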


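$$\{A, B\}_{PB} = \frac{\partial A}{\partial p^i} \frac{\partial B}{\partial q^i} - \frac{\partial A}{\partial q^i} \frac{\partial B}{\partial p^i} \tag{1.29}$$

then we can write the time variation of the field F as: 

$$\dot{F} = \frac{\partial F}{\partial t} + \{H, F\}_{PB} \tag{1.30}$$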

At this point, we have derived the Newtonian equations of motion by mini- 
mizing the action, reproducing the classical result. This is not new. However, 
we will now make the transition to the quantum theory, which treats the action 
as the fundamental object, incorporating both allowed and forbidden paths. The 
Newtonian equations of motion then specify the most likely path, but certainly 
not the only path. In the subatomic realm, in fact, the classically forbidden paths 
may dominate over the classical one. 

There are many ways in which to make the transition from classical mechanics 
to quantum mechanics. (Perhaps the most profound and powerful is the path 
integral method, which we present in Chapter 8.) Historically, however, it was 
Dirac who noticed that the transition from classical to quantum mechanics can be 
achieved by replacing the classical Poisson brackets with commutators: 


$$\{A, B\}_{PB} \rightarrow \frac{i}{\hbar} [A, B] \tag{1.31}$$


With this replacement, the Poisson brackets between canonical coordinates are 
replaced by: 


$$[p^i, q^j] = -i\hbar \, \delta^{ij} \tag{1.32}$$

Quantum mechanics makes the replacement: 


$$p^i \rightarrow -i\hbar \frac{\partial}{\partial q^i}; \qquad E \rightarrow i\hbar \frac{\partial}{\partial t} \tag{1.33}$$


Because $p^i$ is now an operator, the Hamiltonian also becomes an operator, and we 
can now satisfy Hamilton’s equations by demanding that they vanish when acting 
on some function $\psi(q^i, t)$: 


$$\left( -\frac{\hbar^2}{2m} \frac{\partial^2}{\partial (q^i)^2} + V(q) \right) \psi = i\hbar \frac{\partial \psi}{\partial t} \tag{1.34}$$


This is the Schrödinger wave equation, which is the starting point for calculating 
the spectral lines of the hydrogen atom. 


1.8 From First to Second Quantization 


This process, treating the coordinates $q^i$ and $p^i$ as quantized variables, is called 
first quantization. However, the object of this book is to make the transition 
to second quantization, where we quantize fields which have an infinite number 
of degrees of freedom. The transition from quantum mechanics to quantum 
field theory arises when we make the transition from three degrees of freedom, 
described by $x^i$, to an infinite number of degrees of freedom, described by a field 
$\phi(x^i)$, and then apply quantization relations directly onto the fields. 

Before we make the transition to quantum field theory in Chapter 3, let us 
discuss how we describe classical field theory, that is, classical systems with an 
infinite number of degrees of freedom. Let us consider a classical system, a series 
of masses arranged in a line, each connected to the next via springs (Fig. 1.2). 

The Lagrangian of the system then contains two terms, the potential energy 
of each spring as well as the kinetic energy of the masses. The Lagrangian is 
therefore given by: 


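$$L = \sum_{r=1}^{N} \left[ \frac{1}{2} m \dot{x}_r^2 - \frac{1}{2} k (x_r - x_{r+1})^2 \right] \tag{1.35}$$


Now let us assume that we have an infinite number of masses, each separated 
by a distance $\epsilon$. In the limit $\epsilon \rightarrow 0$, the Lagrangian becomes: 


$$L \rightarrow \int dx \, \frac{1}{2} \left[ \mu \, \dot{\phi}(x)^2 - Y \left( \frac{\partial \phi(x)}{\partial x} \right)^2 \right] \tag{1.36}$$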


Figure 1.2. The action describing a finite chain of springs, in the limit of an infinite number 
of springs, becomes a theory with an infinite number of degrees of freedom. 

where we have taken the limit via: 


$$\begin{aligned} \epsilon &\rightarrow dx \\ \frac{m}{\epsilon} &\rightarrow \mu \\ k\epsilon &\rightarrow Y \\ x_r &\rightarrow \phi(x, t) \end{aligned} \tag{1.37}$$


where $\phi(x, t)$ is the displacement of the particle located at position x and time t, 
where $\mu$ is the mass density along the springs, and Y is the Young’s modulus. 

If we use the Euler-Lagrange equations to find the equations of motion, we 
arrive at: 


$$\frac{\partial^2 \phi}{\partial x^2} - \frac{\mu}{Y} \frac{\partial^2 \phi}{\partial t^2} = 0 \tag{1.38}$$


which is just the familiar wave equation for a one-dimensional system traveling 
with velocity $\sqrt{Y/\mu}$. 

On one hand, we have proved the rather intuitively obvious statement that 
waves can propagate down a long, massive spring. On the other hand, we have 
made the highly nontrivial transition from a system with a finite number of degrees 
of freedom to one with an infinite number of degrees of freedom. 
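
The continuum limit can be illustrated numerically. The sketch below relies on the standard dispersion relation for plane waves on a uniform mass-spring chain, $\omega = 2\sqrt{k/m} \, |\sin(q\epsilon/2)|$, which is not derived in the text and is assumed here; the function name and the sample values of $\mu$ and Y are hypothetical. Holding $\mu = m/\epsilon$ and $Y = k\epsilon$ fixed while shrinking $\epsilon$, the phase velocity $\omega/q$ should approach $\sqrt{Y/\mu}$.

```python
import math

def chain_phase_velocity(wave_number, spacing, mu=1.0, young=4.0):
    """Phase velocity omega/q for a discrete chain with m = mu*eps and k = Y/eps."""
    mass = mu * spacing          # m = mu * epsilon
    spring = young / spacing     # k = Y / epsilon
    # Assumed lattice dispersion relation for nearest-neighbor springs.
    omega = 2.0 * math.sqrt(spring / mass) * abs(math.sin(0.5 * wave_number * spacing))
    return omega / wave_number

continuum_velocity = math.sqrt(4.0 / 1.0)   # sqrt(Y / mu) for the sample values
for eps in (0.5, 0.1, 0.01):
    # The discrete velocity approaches the continuum value as eps shrinks.
    print(eps, chain_phase_velocity(1.0, eps), continuum_velocity)
```

For wavelengths long compared with the spacing, the discreteness of the chain becomes invisible, which is the sense in which the field $\phi(x, t)$ replaces the individual displacements.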

Now let us generalize our previous discussion of the Euler-Lagrange equations 
to include classical field theories with an infinite number of degrees of freedom. 
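We begin with a Lagrangian that is a function of both the field $\phi(x)$ as well as its 
space-time derivatives $\partial_\mu \phi(x)$: 


$$L(\phi(x), \partial_\mu \phi(x)) \tag{1.39}$$


where: 


$$\partial_\mu \equiv \frac{\partial}{\partial x^\mu} = \left( \frac{\partial}{\partial t}, \frac{\partial}{\partial x^i} \right) \tag{1.40}$$


The action is given by a four dimensional integral over a Lagrangian density $\mathcal{L}$: 


$$\begin{aligned} L &= \int d^3x \, \mathcal{L}(\phi, \partial_\mu \phi) \\ S &= \int d^4x \, \mathcal{L} = \int dt \, L \end{aligned} \tag{1.41}$$


integrated between initial and final times. 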


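As before, we can retrieve the classical equations of motion by minimizing 
the action: 


$$\delta S = 0 = \int d^4x \left( \frac{\delta \mathcal{L}}{\delta \phi} \delta \phi + \frac{\delta \mathcal{L}}{\delta \partial_\mu \phi} \delta \partial_\mu \phi \right) \tag{1.42}$$


We can integrate by parts, reversing the direction of the derivative: 


$$\delta S = \int d^4x \left[ \left( \frac{\delta \mathcal{L}}{\delta \phi} - \partial_\mu \frac{\delta \mathcal{L}}{\delta \partial_\mu \phi} \right) \delta \phi + \partial_\mu \left( \delta \phi \, \frac{\delta \mathcal{L}}{\delta \partial_\mu \phi} \right) \right] \tag{1.43}$$


The last term vanishes at the endpoint of the integration; so we arrive at the 
Euler-Lagrange equations of motion: 


$$\partial_\mu \frac{\delta \mathcal{L}}{\delta \partial_\mu \phi} = \frac{\delta \mathcal{L}}{\delta \phi} \tag{1.44}$$

The simplest example is given by the scalar field $\phi(x)$ of a point particle: 

$$\mathcal{L} = \frac{1}{2} \left( \partial_\mu \phi \, \partial^\mu \phi - m^2 \phi^2 \right) \tag{1.45}$$


Inserting this into the equation of motion, we find the standard Klein–Gordon 
equation: 


$$\partial_\mu \partial^\mu \phi + m^2 \phi = 0 \tag{1.46}$$

where: 

$$\partial_\mu \partial^\mu = \frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial (x^i)^2} \tag{1.47}$$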


One of the purposes of this book is to generalize the Klein—Gordon equation 
by introducing higher spins and higher interactions. To do this, however, we must 
first begin with a discussion of special relativity. It will turn out that invariance 
under the Lorentz and Poincaré group will provide the most important guide in 
constructing higher and more sophisticated versions of quantum field theory. 


1.9 Noether’s Theorem 


One of the achievements of this formalism is that we can use the symmetries of 
the action to derive conservation principles. For example, in classical mechanics, 
if the Hamiltonian is time independent, then energy is conserved. Likewise, if 
the Hamiltonian is translation invariant in three dimensions, then momentum is 
conserved: 


$$\begin{aligned} \text{Time independence} &\rightarrow \text{Energy conservation} \\ \text{Translation independence} &\rightarrow \text{Momentum conservation} \\ \text{Rotational independence} &\rightarrow \text{Angular momentum conservation} \end{aligned} \tag{1.48}$$


The precise mathematical formulation of this correspondence is given by 
Noether’s theorem. In general, an action may be invariant under either an in- 
ternal, isospin symmetry transformation of the fields, or under some space-time 
symmetry. We will first discuss the isospin symmetry, where the fields $\phi^\alpha$ vary 
according to some small parameter $\delta \epsilon^\alpha$. 

The action varies as: 


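$$\begin{aligned} \delta S &= \int d^4x \left( \frac{\delta \mathcal{L}}{\delta \phi^\alpha} \delta \phi^\alpha + \frac{\delta \mathcal{L}}{\delta \partial_\mu \phi^\alpha} \delta \partial_\mu \phi^\alpha \right) \\ &= \int d^4x \left( \partial_\mu \frac{\delta \mathcal{L}}{\delta \partial_\mu \phi^\alpha} \, \delta \phi^\alpha + \frac{\delta \mathcal{L}}{\delta \partial_\mu \phi^\alpha} \partial_\mu \delta \phi^\alpha \right) \\ &= \int d^4x \, \partial_\mu \left( \frac{\delta \mathcal{L}}{\delta \partial_\mu \phi^\alpha} \delta \phi^\alpha \right) \end{aligned} \tag{1.49}$$


where we have used the equations of motion and have converted the variation of 
the action into the integral of a total derivative. This defines the current $J^\mu_\alpha$: 


$$J^\mu_\alpha = \frac{\delta \mathcal{L}}{\delta \partial_\mu \phi^\beta} \frac{\delta \phi^\beta}{\delta \epsilon^\alpha} \tag{1.50}$$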


If the action is invariant under this transformation, then we have established that 
the current is conserved: 


$$\partial_\mu J^\mu_\alpha = 0 \tag{1.51}$$


From this conserved current, we can also establish a conserved charge, given by 
the integral over the fourth component of the current: 


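$$Q_\alpha \equiv \int d^3x \, J^0_\alpha \tag{1.52}$$

Now let us integrate the conservation equation: 


$$\begin{aligned} 0 &= \int d^3x \, \partial_\mu J^\mu_\alpha = \int d^3x \, \partial_0 J^0_\alpha + \int d^3x \, \partial_i J^i_\alpha \\ &= \frac{d}{dt} \int d^3x \, J^0_\alpha + \int dS_i \, J^i_\alpha = \frac{d}{dt} Q_\alpha + \text{surface term} \end{aligned} \tag{1.53}$$


Let us assume that the fields appearing in the surface term vanish sufficiently 
rapidly at infinity so that the last term can be neglected. Then: 


$$\partial_\mu J^\mu_\alpha = 0 \quad \rightarrow \quad \frac{dQ_\alpha(t)}{dt} = 0 \tag{1.54}$$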


In summary, the symmetry of the action implies the conservation of a current 
$J^\mu_\alpha$, which in turn implies a conservation principle: 


$$\text{Symmetry} \rightarrow \text{Current conservation} \rightarrow \text{Conservation principle} \tag{1.55}$$


Now let us investigate the second case, when the action is invariant under 
the space-time symmetry of the Lorentz and Poincaré groups. Lorentz symmetry 
implies that we can combine familiar three-vectors, such as momentum and space, 
into four-vectors. We introduce the space-time coordinate $x^\mu$ as follows: 


$$x^\mu = (x^0, x^i) = (t, \mathbf{x}) \tag{1.56}$$


where the time coordinate is defined as $x^0 = t$. 
Similarly, the momentum three-vector $\mathbf{p}$ can be combined with energy to form 
the four-vector: 


$$p^\mu = (p^0, p^i) = (E, \mathbf{p}) \tag{1.57}$$


We will henceforth use Greek symbols from the middle of the alphabet $\mu, \nu$ to 
represent four-vectors, and Roman indices from the middle of the alphabet i, j, k 
to represent space coordinates. 

We will raise and lower indices by using the metric $g_{\mu\nu}$ as follows: 


$$A_\mu = g_{\mu\nu} A^\nu \tag{1.58}$$


where: 


$$g_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \tag{1.59}$$

Now let us use this formalism to construct the current associated with making a 
translation: 


$$x^\mu \rightarrow x^\mu + a^\mu \tag{1.60}$$


where $a^\mu$ is a constant. $a^0$ represents time displacements, and $a^i$ represents space 
displacements. We will now rederive the result from classical mechanics that 
displacement in time (space) leads to the conservation of energy (momentum). 

Under this displacement, a field $\phi(x)$ transforms as $\phi(x) \rightarrow \phi(x + a)$. For 
small $a^\mu$, we can expand the field in a power series in $a^\mu$. Then the change 
in the field after a displacement is given by: 


$$\delta \phi = \phi(x + a) - \phi(x) = \phi(x) + a^\mu \partial_\mu \phi(x) - \phi(x) = a^\mu \partial_\mu \phi(x) \tag{1.61}$$


Therefore, if we make a translation $\delta x^\mu = a^\mu$, then the fields transform, after 
making a Taylor expansion, as follows: 


$$\begin{aligned} \delta \phi &= a^\mu \partial_\mu \phi \\ \delta \partial_\mu \phi &= a^\nu \partial_\nu \partial_\mu \phi \end{aligned} \tag{1.62}$$


The calculation of the current associated with translations proceeds a bit dif- 
ferently from the isospin case. There is an additional term that one must take 
into account, which is the variation of the Lagrangian itself under the space-time 
symmetry. The variation of our Lagrangian is given by: 


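$$\delta \mathcal{L} = a^\mu \partial_\mu \mathcal{L} = \frac{\delta \mathcal{L}}{\delta \phi} \delta \phi + \frac{\delta \mathcal{L}}{\delta \partial_\mu \phi} \delta \partial_\mu \phi \tag{1.63}$$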


Substituting in the variation of the fields and using the equations of motion, we 
find: 


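$$\begin{aligned} \delta \mathcal{L} &= \partial_\mu \left( \frac{\delta \mathcal{L}}{\delta \partial_\mu \phi} \right) a^\nu \partial_\nu \phi + \frac{\delta \mathcal{L}}{\delta \partial_\mu \phi} a^\nu \partial_\mu \partial_\nu \phi \\ &= a^\nu \partial_\mu \left( \frac{\delta \mathcal{L}}{\delta \partial_\mu \phi} \partial_\nu \phi \right) \end{aligned} \tag{1.64}$$

Combining both terms under one derivative, we find: 


$$\partial_\mu \left[ \frac{\delta \mathcal{L}}{\delta \partial_\mu \phi} \partial_\nu \phi - \delta^\mu_\nu \mathcal{L} \right] a^\nu = 0 \tag{1.65}$$


This then defines the energy-momentum tensor $T^\mu{}_\nu$: 


$$T^\mu{}_\nu = \frac{\delta \mathcal{L}}{\delta \partial_\mu \phi} \partial_\nu \phi - \delta^\mu_\nu \mathcal{L} \tag{1.66}$$

which is conserved: 

$$\partial_\mu T^\mu{}_\nu = 0 \tag{1.67}$$


If we substitute the Klein–Gordon action into this expression, we find the 
energy-momentum tensor for the scalar particle: 


$$T_{\mu\nu} = \partial_\mu \phi \, \partial_\nu \phi - g_{\mu\nu} \mathcal{L} \tag{1.68}$$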


Using the equations of motion, we can explicitly check that this energy-momentum 
tensor is conserved. 
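
As an illustration of this check, the components of Eq. (1.68) can be written out explicitly. The sketch below assumes the metric signature (+, −, −, −) and the Klein–Gordon Lagrangian of Eq. (1.45); under those assumptions the time-time component is a sum of squares, and the divergence is proportional to the equation of motion.

```latex
% Energy density: set \mu = \nu = 0 in T_{\mu\nu} = \partial_\mu\phi\,\partial_\nu\phi - g_{\mu\nu}\mathcal{L}
\begin{aligned}
T_{00} &= \dot{\phi}^2 - \tfrac{1}{2}\left( \dot{\phi}^2 - (\nabla\phi)^2 - m^2\phi^2 \right) \\
       &= \tfrac{1}{2}\left( \dot{\phi}^2 + (\nabla\phi)^2 + m^2\phi^2 \right) \ \geq\ 0
\end{aligned}

% Conservation: take the divergence and use \partial_\nu\mathcal{L}
%   = \partial^\mu\phi\,\partial_\nu\partial_\mu\phi - m^2\phi\,\partial_\nu\phi
\begin{aligned}
\partial^\mu T_{\mu\nu}
  &= (\partial^\mu\partial_\mu\phi)\,\partial_\nu\phi
     + \partial_\mu\phi\,\partial^\mu\partial_\nu\phi
     - \partial_\nu\mathcal{L} \\
  &= \left( \partial^\mu\partial_\mu\phi + m^2\phi \right)\partial_\nu\phi \ =\ 0
\end{aligned}
```

The last equality holds only when the field obeys the Klein–Gordon equation, which is why the equations of motion are needed for the check.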

By integrating the energy-momentum tensor, we can generate conserved cur- 
rents, as we saw earlier. As the name implies, the conserved charges corresponding 
to the energy-momentum tensor give us energy and momentum conservation. Let 
us define: 


$$P^\mu \equiv (E, P^i) \tag{1.69}$$

which combines the energy $P^0 = E$ and the momentum $P^i$ into a single four-vector. 
We can show that energy and momentum are both conserved by making 
the following definition: 


$$P^\mu = \int d^3x \, T^{0\mu}, \qquad \frac{d}{dt} P^\mu = 0 \tag{1.70}$$

The conservation of energy-momentum is therefore a consequence of the invariance 
of the action under translations, which in turn corresponds to invariance 
under time and space displacements. Thus, we now have derived the result from 
classical Newtonian mechanics mentioned earlier in Eq. (1.48). 

Next, let us generalize this discussion. We know from classical mechanics 
that invariance of the action under rotations generates the conservation of angular 
momentum. Now, we would like to derive the Lorentz generalization of this result 
from classical mechanics. 
Rotations in three dimensions are described by: 


$$\delta x^i = a^{ij} x^j \tag{1.71}$$


where $a^{ij}$ is an anti-symmetric matrix describing the rotation (i.e., $a^{ij} = -a^{ji}$). 

Let us now construct the generalization of this rotation, the current associated 
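with Lorentz transformations. We define how a four-vector $x^\mu$ changes under a 
Lorentz transformation: 


$$\begin{aligned} \delta x^\mu &= \epsilon^\mu{}_\nu x^\nu \\ \delta \phi(x) &= \epsilon^\mu{}_\nu x^\nu \partial_\mu \phi(x) \end{aligned} \tag{1.72}$$


where $\epsilon^\mu{}_\nu$ is an infinitesimal, antisymmetric constant matrix (i.e., $\epsilon^{\mu\nu} = -\epsilon^{\nu\mu}$). 
Repeating the same steps with this new variation, we have: 

$$\begin{aligned} \delta \mathcal{L} &= \epsilon^\mu{}_\nu x^\nu \partial_\mu \mathcal{L} \\ &= \frac{\delta \mathcal{L}}{\delta \phi} \delta \phi + \frac{\delta \mathcal{L}}{\delta \partial_\rho \phi} \partial_\rho \delta \phi \\ &= \partial_\rho \left( \frac{\delta \mathcal{L}}{\delta \partial_\rho \phi} \delta \phi \right) \end{aligned} \tag{1.73}$$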


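If we extract the coefficient of $\epsilon_{\mu\nu}$ and put everything within the partial 
derivative $\partial_\rho$, we find: 


$$\partial_\rho \left[ \frac{\delta \mathcal{L}}{\delta \partial_\rho \phi} \left( \partial^\mu \phi \, x^\nu - \partial^\nu \phi \, x^\mu \right) - g^{\rho\mu} x^\nu \mathcal{L} + g^{\rho\nu} x^\mu \mathcal{L} \right] = 0 \tag{1.74}$$

This gives us the conserved current: 


$$\mathcal{M}^{\rho,\mu\nu} = T^{\rho\nu} x^\mu - T^{\rho\mu} x^\nu, \qquad \partial_\rho \mathcal{M}^{\rho,\mu\nu} = 0 \tag{1.75}$$


and the conserved charge: 


$$M^{\mu\nu} = \int d^3x \, \mathcal{M}^{0,\mu\nu}, \qquad \frac{d}{dt} M^{\mu\nu} = 0 \tag{1.76}$$

If we restrict our discussion to rotations in three dimensional space, then $\frac{d}{dt} M^{ij} = 0$ 
corresponds to the conservation of angular momentum. If we take all the 
components of this matrix, however, we find the conserved charges that correspond 
to the invariance of the action under the full Lorentz group. 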

There is, however, a certain ambiguity in the definition of the energy-momentum
tensor. The energy-momentum tensor is not a measurable quantity, but the
integrated charges correspond to the physical energy and momentum, and hence
are measurable.

We can exploit this ambiguity and add to the energy-momentum tensor a term: 


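$$ \partial_\lambda E^{\lambda\mu\nu} \tag{1.77} $$

where $E^{\lambda\mu\nu}$ is antisymmetric in the first two indices:

$$ E^{\lambda\mu\nu} = -E^{\mu\lambda\nu} \tag{1.78} $$

Because of this antisymmetry, this tensor satisfies trivially:

$$ \partial_\mu \partial_\lambda E^{\lambda\mu\nu} = 0 \tag{1.79} $$

So we can make the replacement:

$$ T^{\mu\nu} \rightarrow T^{\mu\nu} + \partial_\lambda E^{\lambda\mu\nu} \tag{1.80} $$

This new energy-momentum tensor is conserved, like the previous one. We can
choose this tensor such that the new energy-momentum tensor is symmetric in
$\mu$ and $\nu$.

The addition of this extra tensor to the energy-momentum tensor does not
affect the energy and the momentum, which are measurable quantities. If we take
the integrated charge, we find that the contribution from $E^{\lambda\mu\nu}$ vanishes:

$$ \begin{aligned} P^\mu \rightarrow{} & P^\mu + \int d^3x\, \partial_\lambda E^{\lambda 0 \mu} \\ ={} & P^\mu + \int \partial_i E^{i 0 \mu}\, d^3x \\ ={} & P^\mu + \int_S E^{i 0 \mu}\, dS_i \\ ={} & P^\mu \end{aligned} \tag{1.81} $$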


Thus, the physical energy and momentum are not affected as long as this tensor
vanishes sufficiently rapidly at infinity.

The purpose of adding this new term to the energy-momentum tensor is
that the original one was not necessarily symmetric. This is a problem, since
the conservation of angular momentum requires a symmetric energy-momentum
tensor. For example, if we take the divergence of $\mathcal{M}^{\rho,\mu\nu}$, we find that it does not
vanish in general, but equals $T^{\mu\nu} - T^{\nu\mu}$. However, we can always choose $E^{\lambda\mu\nu}$
such that the energy-momentum tensor is symmetric, so angular momentum is
conserved.
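
As a quick check of the statement about the divergence of the angular momentum
current, one can take the current to be $\mathcal{M}^{\rho,\mu\nu} = T^{\rho\nu} x^\mu - T^{\rho\mu} x^\nu$ (this index
placement is our assumed reading of Eq. (1.75)) and use only the conservation law
$\partial_\rho T^{\rho\nu} = 0$ together with $\partial_\rho x^\mu = \delta^\mu_\rho$:

```latex
\begin{aligned}
\partial_\rho \mathcal{M}^{\rho,\mu\nu}
  &= \partial_\rho \left( T^{\rho\nu} x^\mu - T^{\rho\mu} x^\nu \right) \\
  &= (\partial_\rho T^{\rho\nu})\, x^\mu + T^{\rho\nu}\, \delta^\mu_\rho
   - (\partial_\rho T^{\rho\mu})\, x^\nu - T^{\rho\mu}\, \delta^\nu_\rho \\
  &= T^{\mu\nu} - T^{\nu\mu}
\end{aligned}
```

The right-hand side vanishes precisely when the energy-momentum tensor is symmetric.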

Yet another reason for requiring a symmetric energy-momentum tensor is that 
in general relativity, the gravitational field tensor, which is symmetric, couples to 
the energy-momentum tensor. By the equivalence principle, the gravitational field 
couples equally to all forms of matter via its energy-momentum content. Hence, 
when we discuss general relativity in Chapter 19, we will need a symmetric 
energy-momentum tensor.

In summary, in this chapter we have made the transition from a classical 
system with a finite number of degrees of freedom to a classical field theory with 
an infinite number of degrees of freedom. Instead of a one-particle, classical 
description of a point particle in terms of coordinates, we now have a classical 
formalism defined in terms of fields $\phi(x)$.

In constructing field theories, we clearly see that the study of symmetries 
plays a crucial role. In fact, over the past two decades, physicists have come to 
the realization that symmetries are perhaps the most powerful tool that we have 
in describing the physical universe. As a consequence, it will prove beneficial to 
begin a more systematic discussion of symmetries and group theory. With this 
foundation, many of the rather strange and seemingly arbitrary conventions of 
quantum field theory begin to take on an elegant and powerful form. 

Therefore, in the next chapter we will discuss various symmetries that have 
been shown experimentally to describe the fundamental particles of nature. Then 
in Chapter 3 we will begin a formal introduction to the quantum theory of systems 
with an infinite number of degrees of freedom. 


1.10 Exercises 


1. Show that the Poisson brackets obey the Jacobi identity: 
{A, {B, C}} +{B, {C, A}} +{C, {A, B}} =0 (1.82) 


2. A transformation from the coordinates p and q to the new set P = P(q, p, t)
and Q = Q(q, p, t) is called canonical if Hamilton's equations in Eq. (1.27)
are satisfied with the new variables when we introduce a new Hamiltonian
$\tilde{H}(Q, P, t)$. Show that the Poisson brackets between P and Q are the same
as those between p and q. Thus, the Poisson brackets of the coordinates are
preserved under a canonical transformation.


3. Show that Poisson’s brackets of two arbitrary functions A and B are invariant 
under a canonical transformation. 


4. Since the action principle must be satisfied in the new coordinates, then
$p\dot{q} - H$ must be equal to $P\dot{Q} - \tilde{H}$ up to a total derivative, that is, up to the
time derivative of some arbitrary function F. Show that, without losing any
generality, we can take F to be one of four functions, given by: $F_1(q, Q, t)$,
$F_2(q, P, t)$, $F_3(p, Q, t)$, or $F_4(p, P, t)$.


5. If we choose $F = F_1(q, Q, t)$, then prove that $p = \partial F_1/\partial q$, $P = -\partial F_1/\partial Q$,
and $\tilde{H} = H + \partial F_1/\partial t$.


6. What are the analogous relations for the other three $F_i$ functions?


Chapter 2 
Symmetries and Group Theory 


...although the symmetries are hidden from us, we can sense that 
they are latent in nature, governing everything about us. That's the 
most exciting idea I know: that nature is much simpler than it looks. 
Nothing makes me more hopeful that our generation of human beings 
may actually hold the key to the universe in our hands—that perhaps 
in our lifetimes we may be able to tell why all of what we see in 
this immense universe of galaxies and particles is logically inevitable. 

—S. Weinberg 


2.1 Elements of Group Theory 


So far, we have only described the broad, general principles behind classical field 
theory. In this chapter, we will study the physics behind specific models. We 
must therefore impose extra constraints on our Lagrangian that come from group 
theory. These groups, in turn, are extremely important because they describe the 
symmetries of the subatomic particles found experimentally in nature. Then in 
the next chapter, we will make the transition from classical field theory to the 
quantum theory of fields. 

The importance of symmetries is seen when we write down the theory of
radiation. When we analyze Maxwell fields, we find that they are necessarily
relativistic. Therefore, we can also say that quantum field theory arises out of
the marriage of group theory (in particular the Lorentz and Poincaré groups) and
quantum mechanics. Roughly speaking, we have:


$$ \text{Quantum field theory} = \begin{cases} \text{Group theory} \\ \text{Quantum mechanics} \end{cases} \tag{2.1} $$


In fact, once the group structure of a theory (including the specific
representations) is fixed, we find that the S matrix is essentially unique, up to
certain parameters specifying the interactions. More precisely, we will impose the
following constraints in constructing a quantum field theory:


1. We demand that all fields transform as irreducible representations of the 
Lorentz and Poincaré groups and some isospin group. 


2. We demand that the theory be unitary and the action be causal, renormalizable,
and invariant under these groups.


As simple as these postulates are, they impose enormous constraints on the theory. 
The first assumption will restrict the fields to be massive or massless fields of 
spin 0, 1/2, 1, etc. and will fix their isotopic representations. However, this 
constraint alone cannot determine the action, since there are invariant theories that 
are noncausal or nonunitary. (For example, there are theories with three or higher 
derivatives that satisfy the first condition. However, higher derivative theories 
have “ghosts,” or particles of negative norm that violate unitarity, and hence 
must be ruled out.) The second condition, that the action obeys certain physical 
properties, then fixes the action up to certain parameters and representations, such 
as the various coupling constants found in the interaction. 

Because of the power of group theory, we have chosen to begin our discussion 
of field theory with a short introduction to representation theory. We will find 
this detour to be immensely important; many of the curious “accidents” and 
conventions that occur in field theory, which often seem contrived and artificial, 
are actually byproducts of group theory. 

There are three types of symmetries that will appear in this book. 


1. Space-time symmetries include the Lorentz and Poincaré groups. These 
symmetries are noncompact, that is, the range of their parameters does not 
contain the endpoints. For example, the velocity of a massive particle can 
range from 0 to c, but cannot reach c. 


2. Internal symmetries are ones that mix particles among each other, for example,
symmetries like SU(N) that mix N quarks among themselves. These internal
symmetries rotate fields and particles in an abstract “isotopic space,” in
contrast to real space-time. These groups are compact, that is, the range
of their parameters is finite and contains their endpoints. For example, the
rotation group is parametrized by angles that range between 0 and $\pi$ or $2\pi$.
These internal symmetries can be either global (i.e., independent of space-time)
or local, as in gauge theory, where the internal symmetry transformation varies
at each point in space and time.


3. Supersymmetry nontrivially combines both space-time and internal symmetries.
Historically, it was thought that space-time and isotopic symmetries
were distinct and could never be unified. “No-go theorems,” in fact, were
given to prove the incompatibility of compact and noncompact groups. Attempts
to write down a nontrivial union of these groups with finite-dimensional
unitary representations inevitably met with failure. Only recently has it become
possible to unify them nontrivially and incorporate them into quantum
field theories with supersymmetry, which manifest remarkable properties that
were previously thought impossible. For example, certain supersymmetric
theories are finite to all orders in perturbation theory, without the need for any
renormalization. (However, since elementary particles with supersymmetry
have yet to be discovered, we will only discuss this third class of symmetry
later in the book in Chapters 20 and 21.)


2.2 SO(2) 


We say that a collection of elements $g_i$ forms a group if they either obey or possess
the following:


1. Closure under a multiplication operation; that is, if $g_i$ and $g_j$ are members of
the group, then $g_i \cdot g_j$ is also a member of the group.


2. Associativity under multiplication; that is,


$$ g_i \cdot (g_j \cdot g_k) = (g_i \cdot g_j) \cdot g_k \tag{2.2} $$


3. An identity element 1; that is, there exists an element 1 such that
$g_i \cdot 1 = 1 \cdot g_i = g_i$.


4. An inverse; that is, every element $g_i$ has an element $g_i^{-1}$ such that
$g_i \cdot g_i^{-1} = 1$.
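
The four axioms can be checked mechanically for any finite set of matrices. The
following is a minimal sketch in Python; the language, the helper names `matmul`
and `is_group`, and the choice of example (the four rotations of the plane by
multiples of 90 degrees) are illustrative assumptions of ours, not part of the text.

```python
from itertools import product

def matmul(a, b):
    """Multiply two 2x2 integer matrices stored as tuples of tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

IDENTITY = ((1, 0), (0, 1))
QUARTER_TURN = ((0, 1), (-1, 0))  # rotation of the plane by 90 degrees

# Build the four elements 1, g, g^2, g^3 by repeated multiplication.
elements = [IDENTITY]
for _ in range(3):
    elements.append(matmul(elements[-1], QUARTER_TURN))

def is_group(elems, identity):
    """Check the four group axioms by brute force over a finite set."""
    closure = all(matmul(a, b) in elems for a, b in product(elems, repeat=2))
    associativity = all(
        matmul(a, matmul(b, c)) == matmul(matmul(a, b), c)
        for a, b, c in product(elems, repeat=3)
    )
    has_identity = all(
        matmul(a, identity) == a and matmul(identity, a) == a for a in elems
    )
    inverses = all(any(matmul(a, b) == identity for b in elems) for a in elems)
    return closure and associativity and has_identity and inverses

print(is_group(elements, IDENTITY))      # True: the four rotations form a group
print(is_group(elements[:2], IDENTITY))  # False: {1, g} is not closed
```

The second call fails only on closure, since the product of the quarter turn with
itself is not in the two-element set.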


There are many kinds of groups. A discrete group has a finite number of 
elements, such as the group of rotations that leave a crystal invariant. An important 
class of discrete groups are the parity inversion P, charge conjugation C, and 
time-reversal symmetries T. At this point, however, we are more interested in the 
continuous groups, such as the rotation and Lorentz group, which depend on a set 
of continuous angles. 

To illustrate some of these abstract concepts, it will prove useful to take the 
simplest possible nontrivial example, O(2), or rotations in two dimensions. Even 
the simplest example is surprisingly rich in content. Our goal is to construct the 
irreducible representations of O(2), but first we have to make a few definitions. 
We know that if we rotate a sheet of paper, the length of any straight line on the
paper is constant. If (x, y) describe the coordinates of a point on a plane, then
this means that, by the Pythagorean theorem, the following length is an invariant
under a rotation about the origin:


$$ \text{Invariant:} \quad x^2 + y^2 \tag{2.3} $$


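If we rotate the plane through angle $\theta$, then the coordinates $(x', y')$ of the same
point in the new system are given by:


$$ \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \tag{2.4} $$

We will abbreviate this by:

$$ x'^i = O^{ij}(\theta)\, x^j \tag{2.5} $$

where $x^1 = x$ and $x^2 = y$. (For the rotation group, it makes no difference whether
we place the index as a superscript, as in $x^i$, or as a subscript, as in $x_i$.)

For small angles, this can be reduced to:


$$ \delta x = \theta y; \qquad \delta y = -\theta x \tag{2.6} $$

or simply:

$$ \delta x^i = \theta \epsilon^{ij} x^j \tag{2.7} $$

where $\epsilon^{ij}$ is antisymmetric and $\epsilon^{12} = -\epsilon^{21} = 1$. These matrices form a group;
for example, we can write down the inverse of any rotation, given by
$O^{-1}(\theta) = O(-\theta)$:


$$ O(\theta)\, O(-\theta) = 1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \tag{2.8} $$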


We can also prove associativity, since matrix multiplication is associative. 

The fact that these matrices preserve the invariant length places restrictions 
on them. To find the nature of these restrictions, let us make a rotation on the 
invariant distance: 


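$$ \begin{aligned} x'^i x'^i &= O^{ij} x^j\, O^{ik} x^k \\ &= x^j \left\{ O^{ij} O^{ik} \right\} x^k \\ &= x^j x^j \end{aligned} \tag{2.9} $$

that is, this is invariant if the O matrix is orthogonal:

$$ O^{ij} O^{ik} = \delta^{jk} \tag{2.10} $$

or, more symbolically:

$$ O^T \cdot 1 \cdot O = 1 \tag{2.11} $$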
To take the inverse of an orthogonal matrix, we simply take its transpose. The
unit matrix 1 is called the metric of the group.

The rotation group O(2) is called the orthogonal group in two dimensions.
The orthogonal group O(2), in fact, can be defined as the set of all real,
two-dimensional orthogonal matrices. Any rotation matrix of the form of Eq. (2.4)
can be written as the exponential of a single antisymmetric matrix $\tau$:


$$ O(\theta) = e^{\theta\tau} = \sum_{n=0}^{\infty} \frac{1}{n!} (\theta\tau)^n \tag{2.12} $$

where:

$$ \tau = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \tag{2.13} $$

To see this, we note that the transpose of $e^{\theta\tau}$ is $e^{-\theta\tau}$:

$$ O^T = \left( e^{\theta\tau} \right)^T = e^{-\theta\tau} = O^{-1} \tag{2.14} $$


Another way to prove this identity is simply to power expand the right-hand
side and sum the series. We then see that the Taylor expansions of the cosine and
sine functions re-emerge. After summing the series, we arrive at:


$$ e^{\theta\tau} = \cos\theta\, 1 + \tau \sin\theta = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \tag{2.15} $$


All elements of O(2) are parametrized by one angle $\theta$. We say that O(2) is a
one-parameter group; that is, it has dimension 1.

Let us now take the determinant of both sides of the defining equation:


$$ \det\left( O O^T \right) = \det O \, \det O^T = (\det O)^2 = 1 \tag{2.16} $$


This means that the determinant of O is equal to $\pm 1$. If we take $\det O = 1$, then
the resulting subgroup is called SO(2), or the special orthogonal matrices in two
dimensions. The rotations that we have been studying up to now are members of
SO(2). However, there is also the curious subset where $\det O = -1$. This subset
consists of elements of SO(2) times the matrix:


$$ \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{2.17} $$


This last transformation corresponds to a parity transformation:


$$ x \rightarrow x, \qquad y \rightarrow -y \tag{2.18} $$


A parity transformation P takes a plane and maps it into its mirror image, and
hence it is a discrete, not continuous, transformation, such that $P^2 = 1$.

An important property of groups is that they are uniquely specified by their
multiplication law. It is easy to show that these two-dimensional matrices $O^{ij}$ can
be multiplied in succession as follows:


$$ O^{ij}(\theta)\, O^{jk}(\theta') = O^{ik}(\theta + \theta') \tag{2.19} $$

which simply corresponds to the intuitively obvious notion that if we rotate a
coordinate system by an angle $\theta$ and then by an additional angle $\theta'$, then the net
effect is a rotation of $\theta + \theta'$. In fact, any matrix $D(\theta)$ (not necessarily orthogonal
or even 2 x 2 in size) that has this multiplication rule:

$$ D(\theta)\, D(\theta') = D(\theta + \theta'); \qquad D(\theta) = D(\theta + 2\pi) \tag{2.20} $$

forms a representation of O(2), because it has the same multiplication table.
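
The properties of the rotation matrices collected so far (orthogonality, unit
determinant, the composition law, and the exponential form) can be spot-checked
numerically. The sketch below is ours, not the text's: it uses plain Python lists,
arbitrarily chosen angles, and a truncated power series (30 terms is an assumption
that is ample for angles of order one).

```python
import math

def rotation(theta):
    """The 2x2 rotation matrix O(theta) in the convention x' = O(theta) x."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def exp_series(theta, terms=30):
    """Sum the power series of exp(theta * tau) with tau = [[0, 1], [-1, 0]]."""
    tau = [[0.0, 1.0], [-1.0, 0.0]]
    result = [[1.0, 0.0], [0.0, 1.0]]  # the n = 0 term
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        # term_n = term_{n-1} * (theta * tau) / n
        term = [[theta * x / n for x in row] for row in matmul(term, tau)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

theta, theta_prime = 0.7, 1.9
O = rotation(theta)
identity = [[1.0, 0.0], [0.0, 1.0]]

print(close(matmul(transpose(O), O), identity))      # orthogonality
print(abs(det(O) - 1.0) < 1e-12)                     # det O = +1
print(close(matmul(O, rotation(theta_prime)),
            rotation(theta + theta_prime)))          # composition law
print(close(exp_series(theta), O))                   # exponential form
```

All four checks print `True` for these angles; a numerical check of this kind
illustrates the identities but is of course not a proof.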
For our purposes, we are primarily interested in the transformation properties
of fields. For example, we can calculate how a field $\phi(x)$ transforms under
rotations. Let us introduce the operator:


$$ L \equiv i \epsilon^{ij} x^i \partial^j = i \left( x^1 \partial^2 - x^2 \partial^1 \right) \tag{2.21} $$

Let us define:

$$ U(\theta) \equiv e^{i\theta L} \tag{2.22} $$

Then we define a scalar field as one that transforms under SO(2) as:


$$ \text{Scalar:} \quad U(\theta)\, \phi(x)\, U^{-1}(\theta) = \phi(x') \tag{2.23} $$




(To prove this equation, we use the fact that:

$$ e^A B e^{-A} = B + [A, B] + \frac{1}{2!} [A, [A, B]] + \frac{1}{3!} [A, [A, [A, B]]] + \cdots \tag{2.24} $$

Then we reassemble these terms via a Taylor expansion to prove the transformation
law.)

We can also define a vector field $\phi^i(x)$, where the additional i index also
transforms under rotations:


$$ \text{Vector:} \quad U(\theta)\, \phi^i(x)\, U^{-1}(\theta) = O^{ij}(-\theta)\, \phi^j(x') \tag{2.25} $$


[For this relation to hold, Eq. (2.21) must contain an additional term that
rotates the vector index of the field.] Not surprisingly, we can now generalize
this formula to include the transformation property of the most arbitrary field.
Let $\phi^A(x)$ be an arbitrary field transforming under some representation of SO(2)
labeled by some index A. Then this field transforms as:


$$ U(\theta)\, \phi^A(x)\, U^{-1}(\theta) = D^{AB}(-\theta)\, \phi^B(x') \tag{2.26} $$


where $D^{AB}$ is some representation, either reducible or irreducible, of the group.


2.3 Representations of SO(2) and U(1) 


One of the chief goals of this chapter is to find the irreducible representations of
these groups, so let us be more precise. If $g_i$ is a member of a group G, then the
object $D(g_i)$ is called a representation of G if it obeys:


$$ D(g_i)\, D(g_j) = D(g_i g_j) \tag{2.27} $$


for all the elements in the group. In other words, $D(g_i)$ has the same multiplication
rules as the original group.

A representation is called reducible if $D(g_i)$ can be brought into block diagonal
form; for example, the following matrix is a reducible representation:


$$ D(g_i) = \begin{pmatrix} D_1(g_i) & 0 & 0 \\ 0 & D_2(g_i) & 0 \\ 0 & 0 & D_3(g_i) \end{pmatrix} \tag{2.28} $$


where $D_1$, $D_2$, and $D_3$ are smaller representations of the group. Intuitively, this
means $D(g_i)$ can be split up into smaller pieces, with each piece transforming under
a smaller representation of the same group.




The principal goal of our approach is to find all irreducible representations 
of the group in question. This is because the basic fields of physics transform 
as irreducible representations of the Lorentz and Poincaré groups. The complete 
set of finite-dimensional representations of the orthogonal group comes in two 
classes, the tensors and spinors. (For a special exception to this, see Exercise 10.) 

One simple way of generating higher representations of O(2) is simply to
multiply several vectors together. The product $A^i B^j$, for example, transforms as
follows:


$$ \left( A'^i B'^j \right) = \left[ O^{ii'}(\theta)\, O^{jj'}(\theta) \right] \left( A^{i'} B^{j'} \right) \tag{2.29} $$


This matrix $O^{ii'}(\theta)\, O^{jj'}(\theta)$ forms a representation of SO(2). It has the same
multiplication rule as O(2), but the space upon which it acts is 2 x 2 dimensional.
We call any object that transforms like the product of several vectors a tensor.

In general, a tensor $T^{ijk\cdots}$ under O(2) is nothing but an object that transforms
like the product of a series of ordinary vectors:


$$ \text{Tensor:} \quad (T')^{i_1, i_2, \ldots} = O^{i_1 j_1} O^{i_2 j_2} \cdots T^{j_1, j_2, \ldots} \tag{2.30} $$


The transformation of $T^{ijk\cdots}$ is identical to the transformation of the product
$x^i x^j x^k \cdots$. This product forms a representation of O(2) because the following
matrix:

$$ O^{i_1 i_2 \cdots i_N;\, j_1 j_2 \cdots j_N}(\theta) = O^{i_1 j_1}(\theta)\, O^{i_2 j_2}(\theta) \cdots O^{i_N j_N}(\theta) \tag{2.31} $$

has the same multiplication rule as SO(2).

The tensors that we can generate by taking products of vectors are, in general, 
reducible; that is, within the collection of elements that compose the tensor, we 
can find subsets that by themselves form representations of the group. By taking 
appropriate symmetric and antisymmetric combinations of the indices, we can 
extract irreducible representations (see Appendix). 

A convenient method that we will use to create irreducible representations is
to use two tensors under O(2) that are actually constants: $\delta^{ij}$ and $\epsilon^{ij}$, where the
latter is the antisymmetric constant tensor and $\epsilon^{12} = -\epsilon^{21} = +1$.

Although they do not appear to be genuine tensors, it is easy to prove that they
are. Let us hit them with the orthogonal matrix $O^{ij}$:


$$ \begin{aligned} \delta^{i'j'} &= O^{i'i} O^{j'j} \delta^{ij} \\ \epsilon^{i'j'} &= O^{i'i} O^{j'j} \epsilon^{ij} \end{aligned} \tag{2.32} $$

We instantly recognize the first equation: it is just the definition of an orthogonal
matrix, and so $\delta^{ij}$ is an invariant tensor. The second equation, however, is more
difficult to see. Upon closer inspection, it is just the definition of the determinant of
the O matrix, which is equal to one for SO(2). Thus, both equations are satisfied
by construction. Because $\epsilon^{ij}$ transforms like a tensor only if the determinant
of O is +1, we sometimes call it a pseudotensor. The pseudotensors pick up an
extra factor of minus one when they are transformed under parity transformations.

Using these two constant tensors, for example, we can immediately contract the
tensor $A^i B^j$ to form two scalar combinations: $A^i B^i$ and $A^i \epsilon^{ij} B^j = A^1 B^2 - A^2 B^1$.
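
The difference between the two contractions shows up as soon as a parity
transformation is applied. The following sketch (with arbitrarily chosen vectors
and angle, and helper names `dot` and `cross` that are our own) evaluates both
contractions before and after a rotation and after the parity transformation
$y \rightarrow -y$.

```python
import math

def rotation(theta):
    """SO(2) rotation matrix in the convention x' = O(theta) x."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]

PARITY = [[1.0, 0.0], [0.0, -1.0]]   # det = -1: y -> -y
EPSILON = [[0.0, 1.0], [-1.0, 0.0]]  # epsilon^{12} = -epsilon^{21} = +1

def apply(matrix, vector):
    return [sum(matrix[i][j] * vector[j] for j in range(2)) for i in range(2)]

def dot(a, b):
    """The contraction A^i B^i, built with delta^{ij}."""
    return sum(a[i] * b[i] for i in range(2))

def cross(a, b):
    """The contraction A^i epsilon^{ij} B^j = A^1 B^2 - A^2 B^1."""
    return sum(a[i] * EPSILON[i][j] * b[j] for i in range(2) for j in range(2))

A, B = [1.0, 2.0], [-0.5, 3.0]
O = rotation(0.83)

A_rot, B_rot = apply(O, A), apply(O, B)
A_par, B_par = apply(PARITY, A), apply(PARITY, B)

# Columns: original vectors, rotated vectors, parity-reflected vectors.
print(dot(A, B), dot(A_rot, B_rot), dot(A_par, B_par))
print(cross(A, B), cross(A_rot, B_rot), cross(A_par, B_par))
```

For these inputs the first contraction equals 5.5 in all three cases (up to
rounding in the rotated case), while the second equals 4 before and after the
rotation and changes sign under parity, as expected of a pseudoscalar.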

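This process of symmetrizing and antisymmetrizing all possible tensor indices
to find the irreducible representations is also aided by the simple identities:


$$ \begin{aligned} \epsilon^{ij} \epsilon^{kl} &= \delta^{ik} \delta^{jl} - \delta^{il} \delta^{jk} \\ \epsilon^{ij} \epsilon^{jk} &= -\delta^{ik} \end{aligned} \tag{2.33} $$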


Finally, we can show the equivalence between O(2) and yet another formulation.
We can take a complex object u = a + ib, and say that it transforms as follows:


$$ u' = U(\theta)\, u = e^{i\theta} u \tag{2.34} $$

The matrix $U(\theta)$ is called a unitary matrix, because:

$$ \text{Unitary matrix:} \quad U \times U^\dagger = 1 \tag{2.35} $$


The set of all one-dimensional unitary matrices $U(\theta) = e^{i\theta}$ defines a group called
U(1). Obviously, if we make two such transformations, we find:


$$ e^{i\theta} e^{i\theta'} = e^{i\theta + i\theta'} \tag{2.36} $$

We have the same multiplication law as O(2), even though this construction is
based on a new space, the space of complex one-dimensional numbers. We thus
say that:


$$ SO(2) \sim U(1) \tag{2.37} $$


This means that there is a correspondence between the two, even though they are
defined in two different spaces:


$$ e^{\tau\theta} \leftrightarrow e^{i\theta} \tag{2.38} $$

To see the correspondence between O(2) and U(1), let us consider two scalar
fields $\phi^1$ and $\phi^2$ that transform infinitesimally under SO(2) as in Eq. (2.6):


$$ \delta\phi^i = \theta \epsilon^{ij} \phi^j \tag{2.39} $$

which is just the usual transformation rule for small $\theta$. Because SO(2) ~ U(1),
these two scalar fields can be combined into a single complex scalar field:


$$ \phi = \frac{1}{\sqrt{2}} \left( \phi^1 + i \phi^2 \right) \tag{2.40} $$


Then the infinitesimal variation of this field under U(1) is given by:

$$ \delta\phi = -i\theta\phi \tag{2.41} $$


for small $\theta$. Invariants under O(2) or U(1) can be written as:


$$ \frac{1}{2} \phi^i \phi^i = \phi^* \phi \tag{2.42} $$
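
The finite version of this statement is easy to test numerically. In the sketch
below (our own illustration, with arbitrary sample values), the real pair
$(\phi^1, \phi^2)$ is rotated with the matrix of Eq. (2.4), and the result is compared with
multiplying the complex combination $\phi = (\phi^1 + i\phi^2)/\sqrt{2}$ by a phase. With the sign
conventions used here the phase comes out as $e^{-i\theta}$, in agreement with the
infinitesimal rule $\delta\phi = -i\theta\phi$.

```python
import cmath
import math

def rotate_pair(theta, phi1, phi2):
    """Finite SO(2) rotation of the real pair (phi1, phi2)."""
    c, s = math.cos(theta), math.sin(theta)
    return c * phi1 + s * phi2, -s * phi1 + c * phi2

def to_complex(phi1, phi2):
    """Combine two real components into phi = (phi1 + i phi2) / sqrt(2)."""
    return (phi1 + 1j * phi2) / math.sqrt(2)

theta = 0.4
phi1, phi2 = 1.3, -0.7

phi = to_complex(phi1, phi2)
phi_rotated = to_complex(*rotate_pair(theta, phi1, phi2))

# The rotated complex field should equal the original one times a pure phase.
phase = cmath.exp(-1j * theta)
print(abs(phi_rotated - phase * phi) < 1e-12)

# The quadratic invariant can be computed in either language.
real_invariant = 0.5 * (phi1 ** 2 + phi2 ** 2)
complex_invariant = (phi.conjugate() * phi).real
print(abs(real_invariant - complex_invariant) < 1e-12)
```

Both lines print `True`, so a single complex number carries exactly the same
information as the rotating pair of real components.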


2.4 Representations of SO(3) and SU(2)


The previous group O(2) was surprisingly rich in its representations. It was also 
easy to analyze because all its elements commuted with each other. We call such 
a group an Abelian group. Now, we will study non-Abelian groups, where the 
elements do not necessarily commute with each other. We define O(3) as the 
group that leaves distances in three dimensions invariant: 


$$ \text{Invariant:} \quad x^2 + y^2 + z^2 \tag{2.43} $$


where $x'^i = O^{ij} x^j$. Generalizing the previous steps for SO(2), we know that the
set of 3 x 3, real, and orthogonal matrices O(3) leaves this quantity invariant. The
condition of orthogonality reduces the number of independent numbers down to
9 - 6 = 3 elements. Any member of O(3) can be written as the exponential of an
antisymmetric matrix:


$$ O = \exp\left( i \sum_{i=1}^{3} \theta^i \tau^i \right) \tag{2.44} $$


where $\tau^i$ has purely imaginary elements. There are only three independent
antisymmetric 3 x 3 matrices, so we have the correct counting of independent degrees
of freedom. Therefore O(3) is a three-parameter Lie group, parametrized by three
angles.
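These three antisymmetric matrices $\tau^i$ can be explicitly written as:


$$ \tau^1 = \tau^x = -i \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix}; \qquad \tau^2 = \tau^y = -i \begin{pmatrix} 0 & 0 & -1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix} \tag{2.45} $$

$$ \tau^3 = \tau^z = -i \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \tag{2.46} $$


By inspection, this set of matrices can be succinctly represented by the fully
antisymmetric $\epsilon^{ijk}$ tensor as:


$$ (\tau^i)^{jk} = -i \epsilon^{ijk} \tag{2.47} $$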


where $\epsilon^{123} = +1$. These antisymmetric matrices, in turn, obey the following
properties:


$$ [\tau^i, \tau^j] = i \epsilon^{ijk} \tau^k \tag{2.48} $$

This is an example of a Lie algebra (not to be confused with the Lie group).
The constants $\epsilon^{ijk}$ appearing in the algebra are called the structure constants of
the algebra. A complete determination of the structure constants of any algebra
specifies the Lie algebra, and also the group itself.
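
The commutation relations can be confirmed by brute force. The sketch below is
an illustration of ours: it assumes the generators are given by
$(\tau^i)^{jk} = -i\epsilon^{ijk}$ (with zero-based indices in the code), builds them from the
antisymmetric symbol, and tests $[\tau^i, \tau^j] = i\epsilon^{ijk}\tau^k$ for all nine pairs. All matrix
entries are 0, $\pm 1$, or $\pm i$, so exact comparison is safe.

```python
def epsilon(i, j, k):
    """Fully antisymmetric symbol with epsilon(0, 1, 2) = +1 (zero-based)."""
    return (i - j) * (j - k) * (k - i) // 2

# (tau^i)^{jk} = -i * epsilon^{ijk}
tau = [[[-1j * epsilon(i, j, k) for k in range(3)] for j in range(3)]
       for i in range(3)]

def matmul(a, b):
    return [[sum(a[r][m] * b[m][c] for m in range(3)) for c in range(3)]
            for r in range(3)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[r][c] - ba[r][c] for c in range(3)] for r in range(3)]

def algebra_holds():
    """Check [tau^i, tau^j] = i * epsilon^{ijk} * tau^k for every pair (i, j)."""
    for i in range(3):
        for j in range(3):
            lhs = commutator(tau[i], tau[j])
            rhs = [[sum(1j * epsilon(i, j, k) * tau[k][r][c] for k in range(3))
                    for c in range(3)] for r in range(3)]
            if lhs != rhs:
                return False
    return True

print(algebra_holds())  # True
```

The same loop can be reused for any other candidate set of generators by
replacing `tau` and the structure constants.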
For small angles $\theta^i$, we can write the transformation law as:


$$ \delta x^i = \epsilon^{ijk} x^j \theta^k \tag{2.49} $$

As before, we will introduce the operators:

$$ L^i \equiv i \epsilon^{ijk} x^j \partial^k \tag{2.50} $$


We can show that the commutation relations of $L^i$ satisfy those of SO(3). Let us
construct the operator:


$$ U(\theta^i) = e^{i\theta^i L^i} \tag{2.51} $$

Then a scalar and a vector field, as before, transform as follows:


$$ \begin{aligned} U(\theta^k)\, \phi(x)\, U^{-1}(\theta^k) &= \phi(x') \\ U(\theta^k)\, \phi^i(x)\, U^{-1}(\theta^k) &= (O^{-1})^{ij}(\theta^k)\, \phi^j(x') \end{aligned} \tag{2.52} $$


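For higher tensor fields, we must be careful to select out only the irreducible
fields. The easiest way to decompose a reducible tensor is to take various
symmetric and antisymmetric combinations of the indices. Irreducible representations
can be extracted by using the two constant tensors, $\delta^{ij}$ and $\epsilon^{ijk}$. In carrying
out complicated reductions, it is helpful to know:


$$ \begin{aligned} \epsilon^{ijk} \epsilon^{lmn} ={} & \delta^{il} \delta^{jm} \delta^{kn} - \delta^{il} \delta^{jn} \delta^{km} + \delta^{im} \delta^{jn} \delta^{kl} \\ & - \delta^{im} \delta^{jl} \delta^{kn} + \delta^{in} \delta^{jl} \delta^{km} - \delta^{in} \delta^{jm} \delta^{kl} \\ \epsilon^{ijk} \epsilon^{klm} ={} & \delta^{il} \delta^{jm} - \delta^{im} \delta^{jl} \end{aligned} \tag{2.53} $$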


More generally, we can use the method of Young Tableaux described in the 
Appendix to find more irreducible representations. 

As in the case of O(2), we can also find a relationship between O(3) and a
unitary group. Consider the set of all unitary, 2 x 2 matrices with unit determinant.
These matrices form a group called SU(2), the special unitary
group in two dimensions. Such a matrix has 8 - 4 - 1 = 3 independent elements in
it. Any unitary matrix, in turn, can be written as the exponential of a Hermitian
matrix H, where $H = H^\dagger$:


$$ U = e^{iH} \tag{2.54} $$

Again, to prove this relation, simply take the Hermitian conjugate of both sides:
$U^\dagger = e^{-iH^\dagger} = e^{-iH} = U^{-1}$.

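Since an element of SU(2) can be parametrized by three numbers, the most
convenient set is to use the Hermitian Pauli spin matrices. Any element of SU(2)
can be written as:


$$ U = e^{i\theta^i \sigma^i / 2} \tag{2.55} $$

where the Pauli matrices are:

$$ \sigma^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}; \qquad \sigma^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}; \qquad \sigma^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{2.56} $$

and the $\sigma^i$ satisfy the relationship:

$$ \left[ \frac{\sigma^i}{2}, \frac{\sigma^j}{2} \right] = i \epsilon^{ijk} \frac{\sigma^k}{2} \tag{2.57} $$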


We now have exactly the same algebra as SO(3) as in Eq. (2.48). Therefore, we
can say:


$$ SO(3) \sim SU(2) \tag{2.58} $$


To make this correspondence more precise, we will expand the exponential and
then recollect terms, giving us:


$$ e^{i\sigma^i \theta^i / 2} = \cos(\theta/2) + i (\sigma^k n^k) \sin(\theta/2) \tag{2.59} $$

where $\theta^i = n^i \theta$ and $(n^i)^2 = 1$. The correspondence is then given by:

$$ e^{i\tau^i \theta^i} \leftrightarrow e^{i\sigma^i \theta^i / 2} \tag{2.60} $$


where the left-hand side is a real, 3 x 3 orthogonal matrix, while the right-hand
side is a complex, 2 x 2 unitary matrix. (The isomorphism is only local, i.e.,
within a small neighborhood of each of the parameters. In general, the mapping is
actually one-to-two, and not one-to-one.) Even though these two elements exist
in different spaces, they have the same multiplication law. This means that there
should also be a direct way in which vectors (x, y, z) can be represented in terms
of these spinors. To see the precise relationship, let us define:


$$ h(x) = \sigma \cdot x = \begin{pmatrix} z & x - iy \\ x + iy & -z \end{pmatrix} \tag{2.61} $$

Then the SU(2) transformation:

$$ h' = U h U^{-1} \tag{2.62} $$


is equivalent to the SO(3) transformation:


$$ x' = O \cdot x \tag{2.63} $$
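
This equivalence, and the two-to-one nature of the mapping, can be seen in a
short numerical experiment. The sketch below is ours: it builds
$U = \cos(\theta/2) + i(\sigma \cdot n)\sin(\theta/2)$ for a rotation about the z axis, conjugates
$h(x) = \sigma \cdot x$ with it, and reads the rotated vector back off the matrix entries. The
function names and the sample angle are illustrative assumptions.

```python
import math

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def su2(theta, n):
    """U = cos(theta/2) + i (sigma . n) sin(theta/2) for a unit vector n."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c * (i == j) + 1j * s * sum(n[k] * SIGMA[k][i][j] for k in range(3))
             for j in range(2)] for i in range(2)]

def h_matrix(x):
    """h(x) = sigma . x = [[z, x - iy], [x + iy, -z]]."""
    return [[sum(x[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]

def vector_from_h(h):
    """Read (x, y, z) back off the entries of h."""
    lower_left, upper_left = complex(h[1][0]), complex(h[0][0])
    return [lower_left.real, lower_left.imag, upper_left.real]

def rotate(U, x):
    """x -> x' defined by h(x') = U h(x) U^{-1}, using U^{-1} = U^dagger."""
    return vector_from_h(matmul(matmul(U, h_matrix(x)), dagger(U)))

theta = 0.9
U = su2(theta, [0.0, 0.0, 1.0])          # rotation about the z axis
minus_U = [[-u for u in row] for row in U]

x = [1.0, 0.0, 0.0]
x_rot = rotate(U, x)
print(x_rot)                 # compare with (cos(theta), -sin(theta), 0)
print(rotate(minus_U, x))    # the same vector: U and -U give one rotation
```

The rotated vector agrees with applying the matrix of Eq. (2.4) in the x-y plane,
and since U enters quadratically, U and -U produce the same three-dimensional
rotation.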
2.5 Representations of SO(N)


By now, the generalization to O(N) should be straightforward. (The representations
of SU(N), which are important when we discuss the quark model, are
discussed in the Appendix.)

The essential feature of a rotation in N dimensions is that it preserves distances.
Specifically, the distance from the origin to the point $x^i$, by the Pythagorean
Theorem, is given by $\sqrt{(x^i)^2}$. Therefore, $x^i x^i$ is an invariant, where an
N-dimensional rotation is defined by: $x'^i = O^{ij} x^j$.

The number of independent elements in each member of O(N) is $N^2$ minus
the number of constraints arising from the orthogonality condition:


$$ N^2 - \frac{1}{2} N (N + 1) = \frac{1}{2} N (N - 1) \tag{2.64} $$


This is exactly the number of independent antisymmetric, N x N matrices; that is,
we can parametrize the independent components within O(N) by either orthogonal
matrices or by exponentiating antisymmetric ones (i.e., $O = e^A$).

Any orthogonal matrix can thereby be parametrized as follows:


$$ O = \exp\left( i \sum_{i=1}^{N(N-1)/2} \theta^i \tau^i \right) \tag{2.65} $$


where $\tau^i$ are linearly independent, antisymmetric matrices with purely imaginary
elements. They are called the generators of the group, and $\theta^i$ are the rotation
angles or the parameters of the group.

Finding representations of O(N) is complicated by the fact that the multiplication
table for the parameters of O(N) is quite complicated. However, we know
from the Baker-Campbell-Hausdorff theorem that:


$$ e^A e^B = e^{A + B + (1/2)[A, B] + \cdots} \tag{2.66} $$


where the ellipsis represents multiple commutators of A and B. If $e^A$ and $e^B$ are
close together, then these elements form a group as long as the commutators of A
and B form an algebra.

If we take the commutator of two antisymmetric matrices, then it is easy to
see that we get another antisymmetric matrix. This means that the algebra created
by commuting all antisymmetric matrices is a closed one. We will represent the
algebra as follows:


$$ [\tau^i, \tau^j] = i f^{ijk} \tau^k \tag{2.67} $$




As before, we say that the $f^{ijk}$ are the structure constants of the group. (It
is customary to insert an i in the definition of $\tau$, so that an i appears in the
commutator.)


For arbitrary N, it is possible to find an exact form for the structure constants.
Let us define the generator of O(N) as:


$$ (M^{ij})_{ab} = -i \left( \delta^i_a \delta^j_b - \delta^i_b \delta^j_a \right) \tag{2.68} $$


Since the matrix is antisymmetric in i, j, there are N(N - 1)/2 such matrices,
which is the correct number of parameters for O(N). The indices a, b denote
the various matrix entries of the generator. If we commute these matrices, the
calculation is rather easy because it reduces to contracting over a series of
Kronecker delta functions:


$$ [M^{ij}, M^{lm}] = i \left( -\delta^{jl} M^{im} + \delta^{il} M^{jm} + \delta^{jm} M^{il} - \delta^{im} M^{jl} \right) \tag{2.69} $$
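
Because the index gymnastics in this commutator is error-prone, it is worth
checking by machine. The sketch below is our own: it assumes the generators are
$(M^{ij})_{ab} = -i(\delta^i_a \delta^j_b - \delta^i_b \delta^j_a)$ and tests the relation
$[M^{ij}, M^{lm}] = i(-\delta^{jl} M^{im} + \delta^{il} M^{jm} + \delta^{jm} M^{il} - \delta^{im} M^{jl})$ for every
index combination with N = 4 (an arbitrary choice that keeps the loop small).

```python
from itertools import product

N = 4  # any N >= 2 works; N = 4 keeps the brute-force check fast

def delta(a, b):
    return 1 if a == b else 0

def generator(i, j):
    """(M^{ij})_{ab} = -i (delta^i_a delta^j_b - delta^i_b delta^j_a)."""
    return [[-1j * (delta(i, a) * delta(j, b) - delta(i, b) * delta(j, a))
             for b in range(N)] for a in range(N)]

def matmul(x, y):
    return [[sum(x[a][c] * y[c][b] for c in range(N)) for b in range(N)]
            for a in range(N)]

def commutator(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[a][b] - yx[a][b] for b in range(N)] for a in range(N)]

def right_hand_side(i, j, l, m):
    """i (-d^{jl} M^{im} + d^{il} M^{jm} + d^{jm} M^{il} - d^{im} M^{jl})."""
    terms = [
        (-delta(j, l), generator(i, m)),
        (+delta(i, l), generator(j, m)),
        (+delta(j, m), generator(i, l)),
        (-delta(i, m), generator(j, l)),
    ]
    return [[1j * sum(coeff * mat[a][b] for coeff, mat in terms)
             for b in range(N)] for a in range(N)]

def algebra_holds():
    """Compare both sides for all N^4 index combinations (exact arithmetic)."""
    return all(
        commutator(generator(i, j), generator(l, m)) == right_hand_side(i, j, l, m)
        for i, j, l, m in product(range(N), repeat=4)
    )

print(algebra_holds())  # True
```

Exact comparison is legitimate here because every entry is 0, $\pm 1$, or $\pm i$, so
no rounding occurs.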
To define the action of SO(N) on the fields, let us define the operator:

$$ L^{ij} \equiv i \left( x^i \partial^j - x^j \partial^i \right) \tag{2.70} $$


It is easy to check that $L^{ij}$ satisfies the commutation relations of SO(N). Now
construct the operator:


$$ U(\theta^{ij}) = e^{i\theta^{ij} L^{ij}} \tag{2.71} $$


where $\theta^{ij}$ is antisymmetric. The structure constants of the theory $f^{ijk}$ can also be
thought of as a representation of the algebra. If we define:


$$ (\tau^i)^{jk} = -i f^{ijk} \tag{2.72} $$


then $\tau^i$ as written as a function of the structure constants also forms a representation
of the generators of O(N). We call this the adjoint representation. This also means
that the structure constant $f^{ijk}$ is a constant tensor, just like $\delta^{ij}$.

For our purposes, we will often be interested in how fields transform under
some representation of O(N). Without specifying the exact representation, we
can always write:

$$ \delta\phi^i = i \theta^a (\tau^a)^{ij} \phi^j \tag{2.73} $$

This simply means that we are letting $\phi^i$ transform under some unspecified
representation of the generators of O(N), labeled by the indices i, j.

Finally, one might wonder whether we can find more identities such as O(2) ~
U(1) and O(3) ~ SU(2). With a little work, we can show:


$$ SO(4) \sim SU(2) \otimes SU(2); \qquad SO(6) \sim SU(4) \tag{2.74} $$


One may then wonder whether there are higher sequences of identities between
O(N) and SU(M). Surprisingly, the answer is no. That any such correspondence
would be nontrivial can be checked by simply calculating the number of parameters
in O(N) and SU(M), which are in general quite different. These “accidents” between
Lie groups only occur at low dimensionality and are the exception, rather than the
rule.
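
The parameter counting is simple enough to automate. The sketch below (our own
illustration) uses N(N - 1)/2 for SO(N) and the standard count $M^2 - 1$ for SU(M),
the latter being a fact we supply rather than one derived in this section. It
confirms the counts for the correspondences quoted above and lists every pair
with N and M up to 20 whose counts agree.

```python
def dim_so(n):
    """Number of parameters of SO(N): independent antisymmetric N x N matrices."""
    return n * (n - 1) // 2

def dim_su(m):
    """Number of parameters of SU(M): traceless Hermitian M x M matrices."""
    return m * m - 1

# Correspondences quoted in the text.
print(dim_so(3), dim_su(2))               # 3 3
print(dim_so(4), dim_su(2) + dim_su(2))   # 6 6
print(dim_so(6), dim_su(4))               # 15 15

# All (N, M) with 2 <= N, M <= 20 for which the parameter counts agree.
matches = [(n, m) for n in range(2, 21) for m in range(2, 21)
           if dim_so(n) == dim_su(m)]
print(matches)                            # [(3, 2), (6, 4), (16, 11)]
```

Equal parameter counts are only a necessary condition: the pair (16, 11) has
matching counts, yet SO(16) and SU(11) have different ranks (8 and 10), so no
identity between them exists.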


2.6 Spinors 


In general, there are two major types of representations that occur repeatedly 
throughout physics. The first, of course, are tensors, which transform like the 
product of vectors. Irreducible representations are then found by taking their 
symmetric and antisymmetric combinations. 

Among the most important representations of O(N), however, are the spinor
representations. To explain the spinor representations, let us introduce N objects
$\Gamma^i$, which obey:


$$ \begin{aligned} \{ \Gamma^i, \Gamma^j \} &= 2 \delta^{ij} \\ \{ A, B \} &\equiv AB + BA \end{aligned} \tag{2.75} $$


where the brackets represent an anticommutator rather than a commutator. This 
is called a Clifford algebra. Then we can construct the spinor representation of 
the generators of SO(N) as follows: 


Spinor representation: M'! = aie TY] (2.76) 


By inserting this value of $M^{ij}$ into the definition of the algebra of O(N), one 
can check that it satisfies the commutation relations of Eq. (2.69) and hence gives 
us a new representation of the group. 
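
The check described above can be carried out numerically. The sketch below builds one possible set of N = 4 Clifford generators from Pauli matrices (this particular basis is our assumption, chosen only for illustration), forms $M^{ij} = \frac{i}{4}[\Gamma^i, \Gamma^j]$, and tests the commutators against the orthogonal algebra written in the form of Eq. (2.88) with the metric replaced by $\delta^{ij}$. That sign convention is also an assumption and may be arranged differently from Eq. (2.69).

```python
import itertools
import numpy as np

# Pauli matrices and the 2x2 identity
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
one = np.eye(2, dtype=complex)

# One illustrative choice of N = 4 Clifford generators (4x4 matrices)
gammas = [np.kron(s1, one), np.kron(s2, one), np.kron(s3, s1), np.kron(s3, s2)]
N = len(gammas)


def comm(a, b):
    return a @ b - b @ a


def clifford_ok(gs):
    """Check {Gamma^i, Gamma^j} = 2 delta^{ij}."""
    dim = gs[0].shape[0]
    return all(
        np.allclose(gs[i] @ gs[j] + gs[j] @ gs[i], 2 * (i == j) * np.eye(dim))
        for i in range(len(gs)) for j in range(len(gs))
    )


# Spinor generators M^{ij} = (i/4) [Gamma^i, Gamma^j]
M = [[0.25j * comm(gammas[i], gammas[j]) for j in range(N)] for i in range(N)]


def algebra_ok(gen):
    """Check [M^{ij}, M^{kl}] against the O(N) algebra (delta-metric form)."""
    d = np.eye(N)
    for i, j, k, l in itertools.product(range(N), repeat=4):
        rhs = 1j * (d[j, k] * gen[i][l] - d[i, k] * gen[j][l]
                    - d[j, l] * gen[i][k] + d[i, l] * gen[j][k])
        if not np.allclose(comm(gen[i][j], gen[k][l]), rhs):
            return False
    return True


print(clifford_ok(gammas), algebra_ok(M))
```

The matrices are 4 × 4, consistent with the $2^{N/2}$ counting for N = 4.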

In general, we can find a spinorial matrix representation of O(N) (for N even) 
that is $2^{N/2}$ dimensional and complex. The simplest spinor representation of O(4), 
we will see, gives us the compact version of the celebrated Dirac matrices. For the 
odd orthogonal groups O(N + 1) (where N is even), we can construct the spinors 
$\Gamma^i$ from the spinors for the group O(N). We simply add a new element to the old 
set: 

$$\Gamma_{N+1} = \Gamma^1\Gamma^2\cdots\Gamma^N \tag{2.77}$$

$\Gamma_{N+1}$ has the same anticommutation relations as the other spinors, and hence we 
can form the generators of O(N + 1) out of the spinors of O(N). As before, we 
can construct the transformation properties of the $\Gamma^i$. Let us define: 


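$$U(\theta^{ij}) = \exp\left(i\theta^{ij}M^{ij}\right) \tag{2.78}$$

where $M^{ij}$ is constructed out of the spinors $\Gamma^i$. 
Then it is easy to show that $\Gamma^i$ satisfies the following identity: 

$$U(\theta^{ij})\,\Gamma^k\,U^{-1}(\theta^{ij}) = (O^{-1})^{kl}(\theta^{ij})\,\Gamma^l \tag{2.79}$$

which proves that $\Gamma^i$ transforms as a vector. 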
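
A minimal numerical sketch of this statement, under our own assumptions (an N = 4 Pauli-matrix basis for the $\Gamma^i$ and random parameters $\theta^{ij}$), is given below. Rather than fixing the sign convention of the orthogonal matrix, it expands $U\Gamma^kU^{-1}$ in the basis of the $\Gamma^l$ and checks that the coefficient matrix `R` is real and orthogonal. The helper `exp_i_hermitian` is ours.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
one = np.eye(2, dtype=complex)
gammas = [np.kron(s1, one), np.kron(s2, one), np.kron(s3, s1), np.kron(s3, s2)]
N, dim = len(gammas), gammas[0].shape[0]


def exp_i_hermitian(H):
    """exp(iH) for a Hermitian matrix H, via its eigendecomposition."""
    w, v = np.linalg.eigh(H)
    return (v * np.exp(1j * w)) @ v.conj().T


rng = np.random.default_rng(0)
theta = rng.normal(size=(N, N))
theta = theta - theta.T                    # antisymmetric parameters theta^{ij}

# H = theta^{ij} M^{ij}, with M^{ij} = (i/4)[Gamma^i, Gamma^j] (Hermitian here)
H = sum(theta[i, j] * 0.25j * (gammas[i] @ gammas[j] - gammas[j] @ gammas[i])
        for i in range(N) for j in range(N))
U = exp_i_hermitian(H)
U_inv = U.conj().T

# Expand U Gamma^k U^{-1} = R[k, l] Gamma^l using tr(Gamma^l Gamma^m) = dim * delta^{lm}
rotated = [U @ g @ U_inv for g in gammas]
R = np.array([[np.trace(gammas[l] @ rotated[k]) / dim for l in range(N)]
              for k in range(N)])
reconstructed = [sum(R[k, l] * gammas[l] for l in range(N)) for k in range(N)]

print(np.round(R.real, 3))
```

The fact that `reconstructed` reproduces `rotated` shows that conjugation by U never leaves the span of the $\Gamma^i$, which is the content of the vector transformation law.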

Finally, we should also mention that this spinorial representation of the group 
O(N) is reducible. For example, we could have constructed two projection 
operators: 


$$P_R = \frac{1+\Gamma_{N+1}}{2}, \qquad P_L = \frac{1-\Gamma_{N+1}}{2} \tag{2.80}$$

which satisfy the usual properties of projection operators: 

$$P_L^2 = P_L, \qquad P_R^2 = P_R, \qquad P_RP_L = 0, \qquad P_L + P_R = 1 \tag{2.81}$$

With these two projection operators, the representation splits into two 
self-contained pieces. (When we generalize this construction to the Lorentz group, 
these will be called the “left-handed” and “right-handed” Weyl representations. 
They will allow us to describe neutrino fields.) 
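
The projector properties can be verified explicitly for N = 4. The sketch below again assumes an illustrative Pauli-matrix basis for the $\Gamma^i$ (our choice, not fixed by the text). For N = 4 the product $\Gamma^1\Gamma^2\Gamma^3\Gamma^4$ squares to +1; for other even N the square is $(-1)^{N(N-1)/2}$, so an extra factor of i may be needed before the projectors can be formed.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
one = np.eye(2, dtype=complex)
gammas = [np.kron(s1, one), np.kron(s2, one), np.kron(s3, s1), np.kron(s3, s2)]
identity = np.eye(4, dtype=complex)

# Gamma_{N+1} for N = 4: the product of all four generators
gamma5 = gammas[0] @ gammas[1] @ gammas[2] @ gammas[3]
P_R = 0.5 * (identity + gamma5)
P_L = 0.5 * (identity - gamma5)


def is_projector_pair(p, q):
    """Check p^2 = p, q^2 = q, pq = 0 and p + q = 1."""
    return (np.allclose(p @ p, p) and np.allclose(q @ q, q)
            and np.allclose(p @ q, 0)
            and np.allclose(p + q, np.eye(p.shape[0])))


# Generators M^{ij} = (i/4)[Gamma^i, Gamma^j] commute with the projectors,
# so each projected subspace is mapped into itself.
generators = [0.25j * (gi @ gj - gj @ gi) for gi in gammas for gj in gammas]
commutes = all(np.allclose(m @ P_R, P_R @ m) for m in generators)

rank_R = int(round(np.trace(P_R).real))
rank_L = int(round(np.trace(P_L).real))
print(is_projector_pair(P_R, P_L), commutes, rank_R, rank_L)
```

Each projector has rank 2, so the four-component spinor of O(4) splits into two two-component pieces.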


2.7 Lorentz Group 


Now that we have completed our brief, warm-up discussion of some compact Lie 
groups, let us tackle the main problem of this chapter, finding the representations 
of the noncompact Lorentz and Poincaré groups. 

We define the Lorentz group as the set of all 4 × 4 real matrices that leave the 
following invariant: 

$$\text{Invariant:}\quad s^2 = c^2t^2 - x^ix^i = (x^0)^2 - (x^i)^2 \equiv x^\mu g_{\mu\nu}x^\nu \tag{2.82}$$

The minus signs, of course, distinguish this from the group O(4). 
A Lorentz transformation can be parametrized by: 

$$x'^\mu = \Lambda^\mu{}_\nu\,x^\nu \tag{2.83}$$


Inserting this transformation into the invariant, we find that the $\Lambda$ matrices must 
satisfy: 

$$g_{\mu\nu} = \Lambda^\rho{}_\mu\,g_{\rho\sigma}\,\Lambda^\sigma{}_\nu \tag{2.84}$$

which can be written symbolically as $g = \Lambda^Tg\Lambda$. Comparing this with Eq. (2.11), 
we say that $g_{\mu\nu}$ is the metric of the Lorentz group. If the signs within the metric 
$g_{\mu\nu}$ were all the same, then the group would be O(4). To remind ourselves 
that the signs differ, we call the Lorentz group O(3, 1): 


Lorentz group: O(3, 1) (2.85) 


where the comma separates the positive from the negative signs within the metric. 
In general, an orthogonal group that preserves a metric with M indices of one sign 
and N indices of another sign is denoted O(M, N). 
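
The defining condition $g = \Lambda^Tg\Lambda$ is easy to test numerically. The following sketch assumes the signature (+, −, −, −) and coordinates ordered as (ct, x, y, z); the function names are ours. It checks that a boost and a spatial rotation preserve the metric, while an ordinary O(4)-type rotation mixing t and x does not.

```python
import numpy as np

g = np.diag([1.0, -1.0, -1.0, -1.0])   # metric with signature (+, -, -, -)


def boost_x(phi):
    """Boost along x with hyperbolic angle phi, acting on (ct, x, y, z)."""
    L = np.eye(4)
    L[0, 0] = L[1, 1] = np.cosh(phi)
    L[0, 1] = L[1, 0] = np.sinh(phi)
    return L


def rotation_z(alpha):
    """Ordinary rotation by angle alpha in the x-y plane."""
    L = np.eye(4)
    L[1, 1] = L[2, 2] = np.cos(alpha)
    L[1, 2] = -np.sin(alpha)
    L[2, 1] = np.sin(alpha)
    return L


def euclidean_rotation_tx(alpha):
    """An O(4)-type rotation in the t-x plane (NOT a Lorentz transformation)."""
    L = np.eye(4)
    L[0, 0] = L[1, 1] = np.cos(alpha)
    L[0, 1] = -np.sin(alpha)
    L[1, 0] = np.sin(alpha)
    return L


def preserves_metric(L, metric=g):
    """Check the condition metric = L^T metric L."""
    return np.allclose(L.T @ metric @ L, metric)


def interval(x, metric=g):
    """Invariant s^2 = x^mu g_{mu nu} x^nu."""
    return x @ metric @ x


Lam = boost_x(0.7) @ rotation_z(0.3)
x = np.array([2.0, 1.0, -0.5, 0.3])
print(preserves_metric(Lam), preserves_metric(euclidean_rotation_tx(0.3)))
print(interval(x), interval(Lam @ x))
```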

The minus signs in the metric create an important difference between O(4) 
and the Lorentz group: The invariant distance $s^2$ can be negative or positive, 
while the invariant distance for O(4) was always positive. This means that the 
$x_\mu$ plane splits up into distinct regions that cannot be connected by a Lorentz 
transformation. If x and y are two position vectors, then these regions can be 
labeled by the value of the invariant distance $s^2$: 

$$\begin{aligned}
(x-y)^2 > 0&: \quad \text{time-like}\\
(x-y)^2 = 0&: \quad \text{light-like}\\
(x-y)^2 < 0&: \quad \text{space-like}
\end{aligned} \tag{2.86}$$




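Excluding this crucial difference, the representations of the Lorentz group, for the 
most part, bear a striking resemblance to those of O(4). For example, as before, 
we can introduce the operator $L^{\mu\nu}$ in order to define the action of the Lorentz 
group on fields: 

$$L^{\mu\nu} = x^\mu p^\nu - x^\nu p^\mu = i(x^\mu\partial^\nu - x^\nu\partial^\mu) \tag{2.87}$$

where $p_\mu = i\partial_\mu$. As before, we can show that this generates the algebra of the 
Lorentz group: 

$$[L^{\mu\nu}, L^{\rho\sigma}] = ig^{\nu\rho}L^{\mu\sigma} - ig^{\mu\rho}L^{\nu\sigma} - ig^{\nu\sigma}L^{\mu\rho} + ig^{\mu\sigma}L^{\nu\rho} \tag{2.88}$$

Let us also define: 

$$U(\Lambda) = \exp\left(i\epsilon^{\mu\nu}L_{\mu\nu}\right) \tag{2.89}$$

where, infinitesimally, we have: 

$$\Lambda^{\mu\nu} = g^{\mu\nu} + \epsilon^{\mu\nu} + \cdots \tag{2.90}$$

Then the action of the Lorentz group on a vector field $\phi^\mu$ can be expressed as: 

$$\text{Vector:}\quad U(\Lambda)\,\phi^\mu(x)\,U^{-1}(\Lambda) = (\Lambda^{-1})^\mu{}_\nu\,\phi^\nu(x') \tag{2.91}$$

(where $U(\Lambda)$ contains an additional piece which rotates the vector index of $\phi^\mu$). 
Let us parametrize the $\Lambda^\mu{}_\nu$ of a Lorentz transformation as follows: 

$$x' = \frac{x+vt}{\sqrt{1-v^2/c^2}}; \qquad y' = y; \qquad z' = z; \qquad t' = \frac{t+vx/c^2}{\sqrt{1-v^2/c^2}} \tag{2.92}$$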


We can make several observations about this transformation from a group 
point of view. First, the velocity v is a parameter of this group. Since the velocity 
varies as $0 \le v < c$, where v is strictly less than the speed of light c, we say 
the Lorentz group is noncompact; that is, the range of the parameter v does not 
include the endpoint c. This stands in contrast to the group O(N), where the 
parameters have finite range, include the endpoints, and are hence compact. 

We say that the three components of velocity $v^x$, $v^y$, $v^z$ are the parameters of 
Lorentz boosts. If we make the standard replacement: 

$$\gamma = \frac{1}{\sqrt{1-v^2/c^2}} = \cosh\phi, \qquad \beta\gamma = \sinh\phi; \qquad \beta = v/c \tag{2.93}$$
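then this transformation can be written as: 

$$\begin{pmatrix} x'^0\\ x'^1\\ x'^2\\ x'^3 \end{pmatrix} =
\begin{pmatrix} \cosh\phi & \sinh\phi & 0 & 0\\ \sinh\phi & \cosh\phi & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x^0\\ x^1\\ x^2\\ x^3 \end{pmatrix} \tag{2.94}$$

Let us rewrite this in terms of $M^{\mu\nu}$. Let us define: 

$$J^i = \frac{1}{2}\epsilon^{ijk}M^{jk}, \qquad K^i = M^{0i} \tag{2.95}$$

Written out explicitly, this becomes: 

$$K^x = K^1 = -i\begin{pmatrix} 0 & 1 & 0 & 0\\ 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 \end{pmatrix} \tag{2.96}$$

Then this Lorentz transformation can be written as: 

$$e^{iK^x\phi} = \cosh\phi + i\sinh\phi\,K^x \tag{2.97}$$

where the right-hand side is understood as acting on the $(x^0, x^1)$ block, the 
$x^2$ and $x^3$ coordinates being left unchanged. 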
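
The relation between the velocity, the hyperbolic angle $\phi$, and the generator $K^x$ can be illustrated with a short sketch. Below, `rapidity`, `boost_matrix` and `boost_from_generator` are our own helper names, and units with c = 1 are assumed. The code exponentiates $iK^x\phi$, compares the result with the matrix of Eq. (2.94), and shows that the angle $\phi$ adds under successive boosts while growing without bound as $\beta \to 1$, which is the noncompactness discussed above.

```python
import numpy as np

# Boost generator K^x = -i S, with S the real symmetric matrix in Eq. (2.96)
S = np.zeros((4, 4))
S[0, 1] = S[1, 0] = 1.0
Kx = -1j * S


def boost_from_generator(phi):
    """exp(i K^x phi) = exp(S phi), from the eigendecomposition of S."""
    w, v = np.linalg.eigh(S)
    return (v * np.exp(w * phi)) @ v.T


def boost_matrix(phi):
    """The matrix of Eq. (2.94)."""
    L = np.eye(4)
    L[0, 0] = L[1, 1] = np.cosh(phi)
    L[0, 1] = L[1, 0] = np.sinh(phi)
    return L


def rapidity(beta):
    """phi = artanh(beta), so that cosh(phi) = gamma and sinh(phi) = beta*gamma."""
    return np.arctanh(beta)


beta1, beta2 = 0.6, 0.7
phi_total = rapidity(beta1) + rapidity(beta2)
combined_beta = np.tanh(phi_total)       # relativistic velocity addition
print(combined_beta, (beta1 + beta2) / (1 + beta1 * beta2))
print(rapidity(0.9), rapidity(0.999999))  # phi grows without bound as beta -> 1
```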


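Similarly, Lorentz boosts in the y and z direction are generated by exponentiating 
$K^y$ and $K^z$: 

$$K^y = K^2 = -i\begin{pmatrix} 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0\\ 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 \end{pmatrix}; \qquad
K^z = K^3 = -i\begin{pmatrix} 0 & 0 & 0 & 1\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ 1 & 0 & 0 & 0 \end{pmatrix} \tag{2.98}$$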


Unfortunately, it is easily checked that a boost in the x direction, followed 
by a boost in the y direction, does not generate another Lorentz boost. Thus, the 
three K matrices by themselves do not generate a closed algebra. To complete the 
algebra, we must introduce the generators of the ordinary rotation group O(3) as 
well: 

$$J^x = J^1 = -i\begin{pmatrix} 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 1\\ 0 & 0 & -1 & 0 \end{pmatrix}; \qquad
J^y = J^2 = -i\begin{pmatrix} 0 & 0 & 0 & 0\\ 0 & 0 & 0 & -1\\ 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 \end{pmatrix} \tag{2.99}$$

$$J^z = J^3 = -i\begin{pmatrix} 0 & 0 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & -1 & 0 & 0\\ 0 & 0 & 0 & 0 \end{pmatrix} \tag{2.100}$$

The K and J matrices have the following commutation relations: 

$$\begin{aligned}
[K^i, K^j] &= -i\epsilon^{ijk}J^k\\
[J^i, J^j] &= i\epsilon^{ijk}J^k\\
[J^i, K^j] &= i\epsilon^{ijk}K^k
\end{aligned} \tag{2.101}$$


Two pure Lorentz boosts generated by $K^i$ taken in succession do not generate 
a third boost, but must generate a rotation generated by $J^i$ as well. (Physically, 
this rotation, which arises after several Lorentz boosts, gives rise to the Thomas 
precession effect.) 

By taking linear combinations of these generators, we can show that the 
Lorentz group can be split up into two pieces. We will exploit the fact that 
SO(4) ~ SU(2) ⊗ SU(2), and that the algebra of the Lorentz group is similar to 
the algebra of SO(4) (modulo minus signs coming from the metric). The 
appropriate linear combinations are: 

$$A^i = \frac{1}{2}(J^i + iK^i), \qquad B^i = \frac{1}{2}(J^i - iK^i) \tag{2.102}$$

We then have $[A^i, B^j] = 0$, so the algebra has now split up into two distinct pieces, 
each piece generating a separate SU(2). 
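
The commutation relations (2.101) and the splitting (2.102) can be confirmed directly from the 4 × 4 matrices. In the sketch below the helpers `unit`, `eps3`, `check` and `boost` are our own illustrative names. The last lines also show that an x boost followed by a y boost is not a pure boost: a pure boost in this basis is a symmetric matrix, while the product is not.

```python
import itertools
import numpy as np


def unit(a, b):
    """4x4 matrix with a single 1 in row a, column b."""
    m = np.zeros((4, 4), dtype=complex)
    m[a, b] = 1
    return m


def eps3(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def comm(a, b):
    return a @ b - b @ a


# Boost and rotation generators of Eqs. (2.96), (2.98)-(2.100)
K = [-1j * (unit(0, i) + unit(i, 0)) for i in (1, 2, 3)]
J = [-1j * (unit(2, 3) - unit(3, 2)),
     -1j * (unit(3, 1) - unit(1, 3)),
     -1j * (unit(1, 2) - unit(2, 1))]


def check(X, Y, Z, sign):
    """Test [X^i, Y^j] = sign * i * eps^{ijk} Z^k for all i, j."""
    return all(
        np.allclose(comm(X[i], Y[j]),
                    sign * 1j * sum(eps3(i, j, k) * Z[k] for k in range(3)))
        for i, j in itertools.product(range(3), repeat=2)
    )


A = [0.5 * (J[i] + 1j * K[i]) for i in range(3)]
B = [0.5 * (J[i] - 1j * K[i]) for i in range(3)]
ab_commute = all(np.allclose(comm(a, b), 0) for a in A for b in B)


def boost(Ki, phi):
    """exp(i K phi); i*K is a real symmetric matrix for these generators."""
    w, v = np.linalg.eigh((1j * Ki).real)
    return (v * np.exp(w * phi)) @ v.T


product = boost(K[0], 0.5) @ boost(K[1], 0.5)
print(check(K, K, J, -1), check(J, J, J, +1), check(J, K, K, +1), ab_commute)
print(np.allclose(product, product.T))   # False: the product is not a pure boost
```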

If we change the sign of the metric so that we only have compact groups, 
then we have just proved that the Lorentz group, for our purposes, can be written 
as SU(2) ⊗ SU(2). This means that irreducible representations (j) of SU(2), 
where j = 0, 1/2, 1, 3/2, etc., can be used to construct representations of the 
Lorentz group, labeled by (j, j′) (see Appendix). By simply pairing off two 
representations of SU(2), we can construct all the representations of the Lorentz 
group. (The representation is spinorial when j + j′ is half-integral.) In this fashion, 
we can construct both the tensor and spinor representations of the Lorentz group. 

We should also remark that not all groups have spinorial representations. 
For example, GL(N), the group of all invertible real N × N matrices, does not have 
any finite-dimensional spinorial representation. (This will have a great impact on 
the description of electrons in general relativity in Chapter 19.) 


2.8 Representations of the Poincaré Group 


Physically, we can generalize the Lorentz group by adding translations: 

$$x'^\mu = \Lambda^\mu{}_\nu\,x^\nu + a^\mu \tag{2.103}$$

The Lorentz group with translations now becomes the Poincaré group. Because 
the Poincaré group includes four translations in addition to three rotations and 
three boosts, it is a 10-parameter group. In addition to the usual generator of the 
Lorentz group, we must add the translation generator $P_\mu = i\partial_\mu$. 

The Poincaré algebra is given by the usual Lorentz algebra, plus some new 
relations: 

$$\begin{aligned}
[L_{\mu\nu}, P_\rho] &= -ig_{\mu\rho}P_\nu + ig_{\nu\rho}P_\mu\\
[P_\mu, P_\nu] &= 0
\end{aligned} \tag{2.104}$$

These relations mean that two translations commute, and that translations 
transform as a genuine vector under the Lorentz group. 
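
At the level of finite transformations, these two statements can be illustrated with a 5 × 5 matrix realization of the map $x \to \Lambda x + a$ acting on $(x^0, x^1, x^2, x^3, 1)$. This embedding, and the helper names `poincare`, `translation` and `act`, are our own illustrative choices. The sketch checks that translations commute and that conjugating a translation by a Lorentz transformation yields the translation by $\Lambda a$.

```python
import numpy as np


def boost_x(phi):
    """Boost along x with hyperbolic angle phi."""
    L = np.eye(4)
    L[0, 0] = L[1, 1] = np.cosh(phi)
    L[0, 1] = L[1, 0] = np.sinh(phi)
    return L


def poincare(L, a):
    """Embed x -> L x + a as a 5x5 matrix acting on (x^0, x^1, x^2, x^3, 1)."""
    P = np.eye(5)
    P[:4, :4] = L
    P[:4, 4] = a
    return P


def translation(a):
    return poincare(np.eye(4), a)


def act(P, x):
    """Apply a Poincare element to a space-time point x."""
    return (P @ np.append(x, 1.0))[:4]


L = boost_x(0.4)
a = np.array([1.0, 2.0, 0.0, -1.0])
b = np.array([0.5, 0.0, 3.0, 1.0])
x = np.array([2.0, 1.0, 0.0, 0.0])

lorentz = poincare(L, np.zeros(4))
conjugated = lorentz @ translation(a) @ np.linalg.inv(lorentz)
print(act(poincare(L, a), x))
print(np.allclose(conjugated, translation(L @ a)))
```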

To find the irreducible representations of a Lie group, we will often use the 
technique of simultaneously diagonalizing a subset of its generators. Let the rank 
of a Lie group be the maximum number of generators that simultaneously commute 
among themselves. The rank and dimension of O(N) and SU(N) are given by: 


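$$\begin{array}{lcc}
\text{Group} & \text{Dimension} & \text{Rank}\\
SO(N),\ N\ \text{even} & \tfrac{1}{2}N(N-1) & N/2\\
SO(N),\ N\ \text{odd} & \tfrac{1}{2}N(N-1) & (N-1)/2\\
SU(N) & N^2-1 & N-1
\end{array} \tag{2.105}$$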

For example, the group SO(3) has rank 1, so we can choose $L_3$ to be the generator 
to diagonalize. The group SO(4), as we saw, can be re-expressed in terms of 
two SU(2) subgroups, so that there are two generators that commute among 
themselves. 

In addition, we have the Casimir operators of the group, that is, those operators 
that commute with all the generators of the algebra. For the group O(3), we know, 
for example, that the Casimir operator is the sum of the squares of the generators: 
$L_i^2$. Therefore, we can simultaneously diagonalize $L_i^2$ and $L_3$. The representations 
of SO(3) then correspond to the eigenvalues of these two operators. 
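
As a small worked example, the sketch below uses the 3 × 3 vector (spin-1) representation $(L_i)_{jk} = -i\epsilon_{ijk}$, which is our choice of illustration. It checks that $L_i^2$ commutes with every generator and equals s(s + 1) = 2 times the identity, while $L_3$ has eigenvalues −1, 0, +1.

```python
import numpy as np


def eps3(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


# Generators of SO(3) in the vector representation: (L_i)_{jk} = -i eps_{ijk}
L = [np.array([[-1j * eps3(i, j, k) for k in range(3)] for j in range(3)])
     for i in range(3)]

casimir = sum(Li @ Li for Li in L)                 # L_i^2 summed over i
commutes_with_all = all(np.allclose(casimir @ Li, Li @ casimir) for Li in L)
l3_eigenvalues = np.linalg.eigvalsh(L[2])          # returned in ascending order

print(np.round(casimir.real, 10))
print(commutes_with_all, np.round(l3_eigenvalues, 10))
```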

For the Poincaré group, we first know that $P_\mu^2 = m^2$, or the mass squared, is 
a Casimir operator. Under Lorentz transformations, it transforms as a genuine 
scalar, and is hence invariant. Also, it is invariant under translations because all 
translations commute. 

To find the other Casimir operator, we introduce: 

$$W^\mu = \frac{1}{2}\epsilon^{\mu\nu\rho\sigma}P_\nu L_{\rho\sigma} \tag{2.106}$$

which is called the Pauli–Lubanski tensor (where $\epsilon^{0123} = +1$). Then the square of 
this tensor is a Casimir operator as well: 

$$\text{Casimir operators} = \{P_\mu^2,\ W_\mu^2\} \tag{2.107}$$


All physical states in quantum field theory can be labeled according to the eigen- 
value of these two Casimir operators (since the Casimir commutes with all gen- 
erators of the algebra). However, the physical significance of this new Casimir 
operator is not immediately obvious. It cannot correspond to spin, since our intu- 
itive notion of angular momentum, which we obtain from nonrelativistic quantum 
mechanics, is, strictly speaking, lost once we boost all particles by a Lorentz 
transformation. The usual spin operator is no longer a Casimir operator of the 
Lorentz group. 

To find the physical significance of $W_\mu^2$, let us therefore go to the rest frame 
of a massive particle: $P^\mu = (m, \mathbf{0})$. Inserting this into the equation for the 
Pauli–Lubanski tensor, we find: 

$$\begin{aligned}
W_i &= -\frac{1}{2}m\,\epsilon_{ijk0}L^{jk} = -mL_i\\
W_0 &= 0
\end{aligned} \tag{2.108}$$

where $L_i$ is just the usual rotation matrix in three dimensions. Thus, in the rest 
frame of a massive particle, the Pauli–Lubanski tensor is just the spin generator. 
Its square is therefore the Casimir of SO(3), which we know yields the spin of the 
particle: 

$$W_i^2 = m^2s(s+1) \tag{2.109}$$

where s is the spin of the particle. 

In the rest frame of the massive particle, we have 2s + 1 components for a 
spin-s particle. This corresponds to a generalization of our intuitive understanding 
of spin coming from nonrelativistic quantum mechanics. 

However, this analysis is incomplete because we have not discussed massless 
particles, where $P_\mu^2 = 0$. In general, the counting rule for massive spinning states 
breaks down for massless ones. For these particles, we have: 

$$W_\mu W^\mu|p\rangle = W_\mu P^\mu|p\rangle = P_\mu P^\mu|p\rangle = 0 \tag{2.110}$$

The only way to satisfy these three conditions is to have $W_\mu$ and $P_\mu$ be proportional 
to each other; that is, $W_\mu|p\rangle = hP_\mu|p\rangle$ on a massless state $|p\rangle$. This number 
h is called the helicity, and describes the number of independent components of a 
massless state. 

Using the definition of $W_\mu$ for a massless state, we can show that h can be 
written as: 

$$h = \frac{\mathbf{P}\cdot\mathbf{L}}{|\mathbf{P}|} \tag{2.111}$$

Because of the presence of $\epsilon^{\mu\nu\rho\sigma}$ in the definition of $W_\mu$, the Pauli–Lubanski 
vector is actually a pseudovector. Under a parity transformation, the helicity h 
therefore transforms into −h. This means that massless states have two helicity 
states, corresponding to a state where $W_\mu$ is aligned parallel to the momentum 
vector and one where it is aligned antiparallel to the momentum vector. Regardless 
of the spin of a massless particle, the helicity can have only two values, h and −h. 
There is thus an essential difference between massless and massive states in 
quantum field theory. 

It is quite remarkable that we can label all irreducible representations of the 
Poincaré group (and hence all the known fields in the universe) according to the 
eigenvalues of these Casimir operators. A complete list is given as follows in 
terms of the mass m, spin s, and helicity h: 


$$\begin{aligned}
P_\mu^2 > 0&: \quad |m, s\rangle, \quad s = 0, 1/2, 1, 3/2, \ldots\\
P_\mu^2 = 0&: \quad |h\rangle, \quad h = \pm s\\
P_\mu^2 = 0&: \quad s\ \text{continuous}\\
P_\mu^2 < 0&: \quad \text{tachyon}
\end{aligned} \tag{2.112}$$

In nature, the physical spectrum of states seems to be realized only for the first 
two categories with $P_0 > 0$. The other states, which have continuous spins or 
tachyons, have not been seen in nature. 


2.9 Master Groups and Supersymmetry 


Although we have studied the representations of O(3, 1) by following our dis- 
cussion of the representations of O(4), the two groups are actually profoundly 
different. One nontrivial consequence of the difference between O(4) and O(3, 1) 
is summarized by the following: 


No-Go Theorem: There are no finite-dimensional unitary representations of non- 
compact Lie groups. Any nontrivial union of the Poincaré and an internal group 
yields an S matrix which is equal to 1. 


This theorem has caused a certain amount of confusion in the literature. In the 
1960s, after the success of the SU(3) description of quarks, attempts were made to 
construct Master Groups that could nontrivially combine both the Poincaré group 
and the “internal” group SU(3): 

$$\text{Master group} \supset P \otimes SU(3) \tag{2.113}$$


In this way, it was hoped to give a unified description of particle physics in terms 
of group theory. There was intense interest in groups like SU(6, 6) or U(12) 
that combined both the internal and space-time groups. Only later was the no-go 
theorem discovered, which seemed to doom all these ambitious efforts. Because 
of the no-go theorem, unitary representations of the particles were necessarily 
infinite dimensional: These groups possessed nonphysical properties, such as an 
infinite number of particles in each irreducible representation, or a continuous 
spectrum of masses for each irreducible representation. As a consequence, after 
a period of brief enthusiasm, the no-go theorem doomed all these naive efforts to 
build a Master Group for all particle interactions. 

Years later, however, it was discovered that there was a loophole in the no-go 
theorem. It was possible to evade this no-go theorem (the most comprehensive 
version being the Coleman–Mandula theorem) because it made an implicit 
assumption: that the parameters $\theta_i$ of the Master Group were all c numbers. 
However, if the $\theta_i$ could be anticommuting, then the no-go theorem could be 
evaded. 

This leads us to the super groups, and eventually to the superstring, which 
have revived efforts to build Master Groups containing all known particle 
interactions, including gravity. Although supersymmetry holds the promise of being a 
fundamental symmetry of physics, we will study these theories not because they 
have any immediate application to particle physics, but because they provide a 
fascinating laboratory in which one can probe the limits of quantum field theory. 

In summary, the essential point is that quantum field theory grew out of the 
marriage between quantum mechanics and group theory, in particular the Lorentz 
and Poincaré groups. In fact, it is one of the axioms of quantum field theory that 
the fundamental fields of physics transform as irreducible representations of these 
groups. Thus, a study of group theory goes to the heart of quantum field theory. 
We will find that the results of this chapter will be used throughout this book. 


2.10 Exercises 


1. By a direct calculation, show that $M^{ij}$ given in Eq. (2.68) and the spinor 
representation given in Eq. (2.76) do, in fact, satisfy the commutation relations 
of the Lorentz algebra in Eq. (2.88) if we use the Lorentz metric instead of 
Kronecker delta functions. 


2. Prove that, under a proper orthochronous Lorentz transformation (see 
Appendix for a definition), the sign of the time variable t (if we are in the 
forward light cone) does not change. Thus, the sign of t is an invariant in the 
forward light cone, and we cannot go backwards in time by using rotations 
and proper orthochronous Lorentz boosts. 


3. Show that the proper orthochronous Lorentz group is, in fact, a group. Do the 
other branches of the Lorentz group also form a group? 


4. For O(3), show that the dimensions of the irreducible tensor representations 
are all positive odd integers. 


5. For O(3), show that: 

$$3\otimes 3\otimes 3 = 7\oplus 5\oplus 5\oplus 3\oplus 3\oplus 3\oplus 1 \tag{2.114}$$

(see Appendix). 


6. Prove that: 

$$e^{A}F(B)e^{-A} = F\left(e^{A}Be^{-A}\right) \tag{2.115}$$

where A and B are operators, and F is an arbitrary function. (Hint: use a 
Taylor expansion of F.) 


7. Prove Eq. (2.24). 


8. Prove that the square of the Pauli–Lubanski vector is a genuine Casimir operator. 


9. Prove that an element of a Clifford algebra $\Gamma^i$ transforms as a vector under 
the Lorentz group; that is, verify Eq. (2.79). 


10. For SO(2), show that the spin eigenvalue can be continuous. (Hint: examine 
$\exp(i\alpha\theta)$ under a complete rotation in $\theta$ if $\alpha$ is fractional.) 


11. Prove that the Lorentz group can be written as SL(2, C), the set of complex, 
2 × 2 matrices with unit determinant. Show that this representation (as well as 
the other representations we have discussed in this chapter) is not unitary, and 
hence satisfies the no-go theorem. Unitarity, however, is required to satisfy 
the conservation of probability. Does the nonunitarity of these representations 
violate this important principle? 


12. In three dimensions, using the contraction of two $\epsilon^{ijk}$ constant tensors into 
Kronecker deltas in Eq. (2.53), prove that the curl of the curl of a vector $\mathbf{A}$ is 
given by $\nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A}$. 


13. Prove that SU(4) is locally isomorphic to SO(6). (Hint: show the equivalence 
of their Lie algebras.) 


14. Prove that there are two constants, $\delta^{ij}$ and $\epsilon^{i_1i_2\cdots i_N}$, which transform as genuine 
tensors under SO(N). To prove these constants are genuine tensors, act upon 
them with $O^{ij}$. Show that the tensor equation for $\delta^{ij}$ reduces to the definition 
of an orthogonal group. Prove that $\epsilon^{i_1i_2\cdots i_N}$ satisfies: 

$$O^{i_1j_1}O^{i_2j_2}\cdots O^{i_Nj_N}\epsilon^{j_1j_2\cdots j_N} = A\,\epsilon^{i_1i_2\cdots i_N} \tag{2.116}$$

What is the constant A? 


15. Prove that: 

$$L_{\mu\nu}^2; \qquad \epsilon^{\mu\nu\rho\sigma}L_{\mu\nu}L_{\rho\sigma} \tag{2.117}$$

are Casimir operators for the Lorentz group (but are not Casimir operators for 
the full Poincaré group). 


16. Re-express: 

$$\epsilon^{i_1i_2\cdots i_N}\epsilon^{j_1j_2\cdots j_N} \tag{2.118}$$

entirely in terms of delta functions for N = 4 and 5. (Hint: check that the 
antisymmetry properties of the $\epsilon$ tensor are satisfied for the product of 
delta functions.) 


Chapter 3 
Spin-0 and $\frac{1}{2}$ Fields 


It is more important to have beauty in one’s equations than to have them 
fit experiment... because the discrepancy may be due to minor features 
that are not properly taken into account and that will get cleared up with 
further developments of the theory.... It seems that if one is working 
from the point of view of getting beauty in one’s equations, and if one has 
really a sound insight, one is on a sure line of progress. 
—P.A.M. Dirac 


3.1 Quantization Schemes 


In the previous chapters, we presented the classical theory of fields and also the 
symmetries they obey. In this chapter, we now make the transition to the quantum 
theory of fields. 

Symbolically, we may write: 


$$\lim_{N\to\infty}\ \text{Quantum mechanics} = \text{Quantum field theory} \tag{3.1}$$


where N is the number of degrees of freedom of the system. We will see that one 
important consequence of this transition is that quantum field theory describes 
multiparticle states, while ordinary quantum mechanics is based on a single- 
particle interpretation. We will find that second quantized systems are ideally 
suited to describing relativistic physics, since relativity introduces pair creation 
and annihilation and hence inevitably introduces multiparticle states. 

In this chapter, we will develop the second quantization program for the 
irreducible representations of the Lorentz group for fields with spin 0 and $\frac{1}{2}$. We 
stress, however, that a number of different types of quantization schemes have 
been proposed over the decades, each with its own merits and drawbacks: 

1. The most direct method is the canonical quantization program, which we 
will develop in this chapter. Canonical quantization closely mimics the 
development of quantum mechanics; that is, time is singled out as a special 
coordinate and manifest Lorentz invariance is sacrificed. The advantage of 
canonical quantization is that it quantizes only physical modes. Unitarity of 
the system is thus manifest. At the level of QED, the canonical quantization 
method is not too difficult, but the canonical quantization of more complicated 
theories, such as non-Abelian gauge theories, is often prohibitively tedious. 


2. The Gupta–Bleuler or covariant quantization method will also be mentioned 
in this chapter. Contrary to canonical quantization, it maintains full Lorentz 
symmetry, which is a great advantage. The disadvantage of this approach is 
that ghosts or unphysical states of negative norm are allowed to propagate 
in the theory, and are eliminated only when we apply constraints to the state 
vectors. 


3. The path integral method is perhaps the most elegant and powerful of all 
quantization programs. One advantage is that one can easily go back and 
forth between many of the other quantization programs to see the 
relationships between them. Although some of the conventions found in various 
quantization programs may seem a bit bizarre or contrived, the path integral 
approach is based on simple, intuitive principles that go to the very heart of 
the assumptions of quantum theory. The disadvantage of the path integral 
approach is that functional integration is a mathematically delicate operation 
that may not even exist in Minkowski space. 


4. The Becchi–Rouet–Stora–Tyutin (BRST) approach is one of the most 
convenient and practical covariant approaches used for gauge theories. Like the 
Gupta–Bleuler quantization program, negative norm states or ghosts are 
allowed to propagate and are eliminated by applying the BRST condition onto 
the state vectors. All the information is contained in a single operator, making 
this a very attractive formalism. The BRST approach can be easily expressed 
in terms of path integrals. 


5. Closely related to the BRST method is the Batalin–Vilkovisky (BV) 
quantization program, which has proved powerful enough to quantize the most 
complicated actions so far proposed, such as those found in string and 
membrane theories. The formalism is rather cumbersome, but it remains the only 
program that can quantize certain complex actions. 


6. Stochastic quantization is yet another quantization program that preserves 
gauge invariance. One postulates a fictitious fifth coordinate, such that the 
physical system eventually settles down to the physical solution as the fifth 
coordinate evolves. 


3.2 Klein–Gordon Scalar Field 


Let us begin our discussion by quantizing the simplest possible relativistic field 
theory, the free scalar field. The theory was proposed independently by six 
different physicists.$^{1-6}$ The Lagrangian is given by: 

$$\mathcal{L} = \frac{1}{2}(\partial_\mu\phi)^2 - \frac{1}{2}m^2\phi^2 \tag{3.2}$$

Historically, the quantization of the Klein–Gordon equation caused much 
confusion. Schrödinger, even before he postulated his celebrated nonrelativistic 
equation, considered this relativistic scalar equation but ultimately discarded it 
because of problems with negative probability and negative energy states. In 
any fully relativistic equation, we must obey the “mass-shell condition” 
$p_\mu^2 = E^2 - \mathbf{p}^2 = m^2$. This means that the energy is given by: 

$$E = \pm\sqrt{\mathbf{p}^2 + m^2} \tag{3.3}$$


The energy can be negative, which is quite disturbing. Even if we banish the 
negative energy states by fiat, we find that interactions with other particles will 
reduce the energy and create negative energy states. This means that all positive 
energy states will eventually collapse into negative energy states, destabilizing 
the entire theory. One can show that even if we prepare a wave packet with only 
states of positive energy, interactions will inevitably introduce negative energy 
states. We will see, however, that these problems with negative 
probability and negative energy can be resolved once one quantizes the theory. 

The canonical quantization program begins with fields $\phi$ and their conjugate 
momentum fields $\pi$, which satisfy equal-time commutation relations among 
themselves. Then the time evolution of these quantized fields is governed by a 
Hamiltonian. Thus, we closely mimic the dynamics found in ordinary quantum 
mechanics. We begin by singling out time as a special coordinate and then defining 
the canonical conjugate field to $\phi$: 

$$\pi(\mathbf{x}, t) = \frac{\delta\mathcal{L}}{\delta\dot\phi(\mathbf{x}, t)} = \dot\phi(\mathbf{x}, t) \tag{3.4}$$

We can introduce the Hamiltonian as: 

$$\mathcal{H} = \pi\dot\phi - \mathcal{L} = \frac{1}{2}\left[\pi^2 + (\nabla\phi)^2 + m^2\phi^2\right] \tag{3.5}$$

Then the transition from classical mechanics to quantum field theory begins when 
we postulate the commutation relations between the field and its conjugate 
momentum: 

$$[\phi(\mathbf{x}, t), \pi(\mathbf{y}, t)] = i\delta^3(\mathbf{x} - \mathbf{y}) \tag{3.6}$$

(The right-hand side is proportional to $\hbar$, which we omit. This is the point 
where the quantum principle begins to emerge from the classical theory.) All 
other commutators (i.e., between $\pi$ and itself, and $\phi$ and itself) are set equal to 
zero. [Although this expression looks nonrelativistic, notice that $x^\mu$ and $y^\mu$ are 
separated by a space-like distance, $(x - y)^2 < 0$, which is preserved under a 
Lorentz transformation.] 

Much of what follows is a direct consequence of this commutation relation. 
There are an infinite number of ways in which to satisfy this relationship, but 
our strategy will be to find a specific Fourier representation of this commutation 
relation in terms of plane waves. When these plane-wave solutions are quantized 
in terms of harmonic oscillators, we will be able to construct the multiparticle 
Hilbert space and also find a specific operator representation of the Lorentz group 
in terms of oscillators. 

We first define the quantity: 

$$k\cdot x \equiv k_\mu x^\mu = Et - \mathbf{k}\cdot\mathbf{x} \tag{3.7}$$

with $E \equiv k^0$. We want a decomposition of the scalar field where the energy $k^0$ is 
positive, and where the Klein–Gordon equation is explicitly obeyed. In momentum 
space, the operator $\partial_\mu^2 + m^2$ becomes, up to an overall sign, $k^2 - m^2$. Therefore, 
we choose: 

$$\phi(x) = \frac{1}{(2\pi)^{3/2}}\int d^4k\,\delta(k^2 - m^2)\,\theta(k_0)\left[A(k)e^{-ik\cdot x} + A^\dagger(k)e^{ik\cdot x}\right] \tag{3.8}$$

where $\theta$ is a step function [$\theta(k_0) = +1$ if $k_0 > 0$ and $\theta(k_0) = 0$ otherwise], and 
where A(k) are operator-valued Fourier coefficients. It is now obvious that this 
field satisfies the Klein–Gordon equation. If we hit this expression with $(\partial_\mu^2 + m^2)$, 
then this pulls down a factor proportional to $k^2 - m^2$, which then vanishes against 
the delta function. 

We can simplify this expression by integrating out $dk^0$ (which also breaks 
manifest Lorentz covariance). To perform the integration, we need to re-express 
the delta function. We note that a function f(x), which satisfies f(a) = 0, obeys 
the relation: 

$$\delta(f(x)) = \frac{\delta(x - a)}{|f'(a)|} \tag{3.9}$$

for x near a. (To prove this relation, simply integrate both sides over x, and then 
change variables. This generates a Jacobian, which explains the origin of $f'$.) 
Since $k^2 = m^2$ has two roots, we find: 

$$\delta(k^2 - m^2) = \frac{\delta\left(k_0 - \sqrt{\mathbf{k}^2 + m^2}\right)}{2|k_0|} + \frac{\delta\left(k_0 + \sqrt{\mathbf{k}^2 + m^2}\right)}{2|k_0|} \tag{3.10}$$

Putting this back into the integral, and using only the positive value of $k^0$, we find: 

$$\int d^4k\,\delta(k^2 - m^2)\,\theta(k_0) = \int d^3k\int dk_0\,\frac{\delta\left(k_0 - \sqrt{\mathbf{k}^2 + m^2}\right)}{2k_0} = \int\frac{d^3k}{2\omega_k} \tag{3.11}$$

where $\omega_k = \sqrt{\mathbf{k}^2 + m^2}$. 
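
The factor $1/2\omega_k$ can be checked numerically by replacing the delta function with a narrow Gaussian. The sketch below is only an illustration under our own assumptions (the Gaussian regulator of width `eps`, the test function $e^{-k_0}$, and the helper names are all ours). It integrates a smooth test function against $\delta(k_0^2 - \omega^2)$ over $k_0 > 0$ and compares the result with the test function evaluated at $k_0 = \omega$, divided by $2\omega$.

```python
import numpy as np


def gaussian_delta(u, eps):
    """Narrow normalized Gaussian used as a stand-in for the Dirac delta."""
    return np.exp(-0.5 * (u / eps) ** 2) / (eps * np.sqrt(2.0 * np.pi))


def k0_integral(test_fn, omega, eps=1e-3, k0_max=4.0, n=400_001):
    """Riemann-sum estimate of int_0^inf dk0 test_fn(k0) delta(k0^2 - omega^2)."""
    k0, dk = np.linspace(0.0, k0_max, n, retstep=True)
    return np.sum(test_fn(k0) * gaussian_delta(k0**2 - omega**2, eps)) * dk


omega = 1.5
numeric = k0_integral(lambda k0: np.exp(-k0), omega)
expected = np.exp(-omega) / (2.0 * omega)     # test_fn(omega) / (2 omega)
print(numeric, expected)
```

Restricting the integration to $k_0 > 0$ plays the role of the step function $\theta(k_0)$, so only the positive root contributes.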


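Now let us insert this expression back into the Fourier decomposition of $\phi(x)$. We 
now find: 

$$\phi(x) = \int\frac{d^3k}{\sqrt{(2\pi)^3\,2\omega_k}}\left[a(k)e^{-ik\cdot x} + a^\dagger(k)e^{ik\cdot x}\right] \tag{3.12}$$

$$\phantom{\phi(x)} = \int d^3k\left[a(k)e_k(x) + a^\dagger(k)e_k^*(x)\right] \tag{3.13}$$

$$\pi(x) = \int d^3k\,i\omega_k\left[-a(k)e_k(x) + a^\dagger(k)e_k^*(x)\right] \tag{3.14}$$

where: 

$$e_k(x) = \frac{e^{-ik\cdot x}}{\sqrt{(2\pi)^3\,2\omega_k}} \tag{3.15}$$

where $A(k) = \sqrt{2\omega_k}\,a(k)$ and where $k^0$ appearing in $k\cdot x$ is now equal to $\omega_k$. We 
can also invert these relations, solving for the Fourier modes a(k) in terms of the 
original scalar field: 

$$\begin{aligned}
a(k) &= i\int d^3x\,e_k^*(x)\overleftrightarrow{\partial_0}\phi(x)\\
a^\dagger(k) &= -i\int d^3x\,e_k(x)\overleftrightarrow{\partial_0}\phi(x)
\end{aligned} \tag{3.16}$$

where: 

$$A\overleftrightarrow{\partial}B \equiv A\,\partial B - (\partial A)B \tag{3.17}$$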
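
Because the fields satisfy equal-time canonical commutation relations, the Fourier 
modes must also satisfy commutation relations: 

$$[a(k), a^\dagger(k')] = \delta^3(\mathbf{k} - \mathbf{k}') \tag{3.18}$$

and all other commutators are zero. To prove that this commutation relation is 
consistent with the original commutator, let us insert the Fourier expansion into 
the equal-time commutator: 

$$\begin{aligned}
[\phi(\mathbf{x}, t), \pi(\mathbf{x}', t)] &= \int\frac{d^3k}{\sqrt{(2\pi)^3\,2\omega_k}}\int\frac{d^3k'}{\sqrt{(2\pi)^3\,2\omega_{k'}}}\,i\omega_{k'}\\
&\quad\times\left[a(k)e^{-ik\cdot x} + a^\dagger(k)e^{ik\cdot x},\ -a(k')e^{-ik'\cdot x'} + a^\dagger(k')e^{ik'\cdot x'}\right]\\
&= i\int\frac{d^3k\,d^3k'}{(2\pi)^3\,2\sqrt{\omega_k\omega_{k'}}}\,\omega_{k'}\,\delta^3(\mathbf{k} - \mathbf{k}')\left(e^{-ik\cdot x + ik'\cdot x'} + e^{ik\cdot x - ik'\cdot x'}\right)\\
&= i\delta^3(\mathbf{x} - \mathbf{x}')
\end{aligned} \tag{3.19}$$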


Thus, this is a consistent choice for the commutators. Now we can calculate the 
Hamiltonian in terms of these Fourier modes: 


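$$\begin{aligned}
H &= \frac{1}{2}\int d^3x\left[\pi^2 + \partial_i\phi\,\partial_i\phi + m^2\phi^2\right]\\
&= \frac{1}{2}\int d^3k\,\omega_k\left[a(k)a^\dagger(k) + a^\dagger(k)a(k)\right]\\
&= \int d^3k\,\omega_k\left[a^\dagger(k)a(k) + \frac{1}{2}\right]
\end{aligned} \tag{3.20}$$

Similarly, we can calculate the momentum $\mathbf{P}$: 

$$\begin{aligned}
\mathbf{P} &= -\int\pi\nabla\phi\,d^3x\\
&= \frac{1}{2}\int d^3k\,\mathbf{k}\left[a^\dagger(k)a(k) + a(k)a^\dagger(k)\right]\\
&= \int d^3k\,\mathbf{k}\left[a^\dagger(k)a(k) + \frac{1}{2}\right]
\end{aligned} \tag{3.21}$$

(We caution that both the energy and momentum are actually divergent because 
of the factor $\frac{1}{2}$ appearing in the infinite sum. We will clarify this important point 
shortly.) 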
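
The origin of the $\frac{1}{2}$ can be seen already for a single mode. The sketch below represents one oscillator by matrices truncated to `n` levels (the truncation and the variable names are our own illustrative choices). The relation $[a, a^\dagger] = 1$ then holds on every level except the highest one kept, so the checks are restricted to the lower levels.

```python
import numpy as np

n = 6                                          # number of levels kept (truncation)
a = np.diag(np.sqrt(np.arange(1, n)), k=1)     # annihilation operator
adag = a.T                                     # creation operator (real matrix)

commutator = a @ adag - adag @ a
number_op = adag @ a                           # a^dagger a = diag(0, 1, 2, ...)
symmetric = 0.5 * (a @ adag + adag @ a)        # (1/2)(a a^dagger + a^dagger a)

vacuum = np.zeros(n)
vacuum[0] = 1.0                                # the state annihilated by a

print(np.diag(commutator))
print(np.diag(symmetric))
```

Away from the truncation edge the symmetric combination equals $a^\dagger a + \frac{1}{2}$, so each mode contributes a zero-point energy $\frac{1}{2}\omega_k$; summing over infinitely many modes produces the divergence mentioned above.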

With these expressions, it is now easy to check that the operators $P^\mu$ and $M^{\mu\nu}$ 
generate translations and Lorentz rotations, as they should: 

$$\begin{aligned}
i[P_\mu, \phi] &= \partial_\mu\phi\\
i[M^{\mu\nu}, \phi] &= (x^\mu\partial^\nu - x^\nu\partial^\mu)\phi
\end{aligned} \tag{3.22}$$

If we exponentiate the generators of translations and Lorentz rotations, we can 
calculate how the field $\phi(x)$ transforms under the Poincaré group. Let us define: 

$$U(\Lambda, a) = \exp\left(-\frac{i}{2}\epsilon_{\mu\nu}M^{\mu\nu} + ia_\mu P^\mu\right) \tag{3.23}$$

where $\Lambda_{\mu\nu} = g_{\mu\nu} + \epsilon_{\mu\nu} + \cdots$. Then it is straightforward to show: 

$$U(\Lambda, a)\,\phi(x)\,U^{-1}(\Lambda, a) = \phi(\Lambda x + a) \tag{3.24}$$

This demonstrates that $\phi(x)$ transforms as a scalar field under the Poincaré group. 

Now that we have successfully shown how to quantize the Klein–Gordon field, 
we must now calculate the eigenstates of the Hamiltonian to find the spectrum of 
states. Let us now define the “vacuum” state as follows: 

$$a(k)|0\rangle = 0 \tag{3.25}$$

By convention, we call a(k) an “annihilation” operator. We define a one-particle 
state in Fock space via the “creation” operator: 

$$a^\dagger(k)|0\rangle = |k\rangle \tag{3.26}$$


The problem with this construction, however, is that the energy associated
with the vacuum state is formally infinite because of the presence of 1/2 in the
sum in Eq. (3.20). We will simply drop this infinite term, since infinite shifts
in the Hamiltonian cannot be measured. Dropping the zero-point energy in the
expression for harmonic oscillators has a simple counterpart in x space. The zero-
point energy emerged when we commuted creation and annihilation operators past
each other. Dropping the zero-point energy is therefore equivalent to moving all
creation operators to the left and annihilation operators to the right. This operation,
in x space, can be accomplished by “normal ordering.” Since the product of two or
more fields at the same point is formally divergent, we can remove this divergence
by the normal ordering operation, which corresponds to moving the part containing
the creation operators to the left and the annihilation operators to the right. If we
decompose $\phi = \phi^+ + \phi^-$, where $-$ ($+$) denotes the creation (annihilation) part of
an operator with negative (positive) frequency, then we define:


$$ :\phi_1\phi_2: \;=\; \phi_1^+\phi_2^+ + \phi_1^-\phi_2^+ + \phi_2^-\phi_1^+ + \phi_1^-\phi_2^- \tag{3.27} $$


Then, by applying the normal ordering to the definition of the Hamiltonian, we can
simply drop the $\tfrac{1}{2}$ appearing in Eq. (3.20). From now on, we assume that when
two fields are multiplied at the same point in space-time, they are automatically
normal ordered.
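
The effect of normal ordering can be illustrated on a single oscillator mode. The following is only a sketch under stated assumptions: the Fock space is truncated to a finite dimension `dim`, and the names `ladder_operators`, `dim`, and `omega` are illustrative choices, not part of the text. It compares the symmetric form of Eq. (3.20) for one mode with its normal-ordered counterpart.

```python
import numpy as np

def ladder_operators(dim):
    """Truncated annihilation and creation matrices for a single mode."""
    a = np.diag(np.sqrt(np.arange(1, dim)), k=1)  # a|n> = sqrt(n)|n-1>
    return a, a.conj().T

dim = 8        # hypothetical truncation of the oscillator tower
omega = 1.0    # hypothetical mode energy
a, a_dag = ladder_operators(dim)

# Symmetric form, as in Eq. (3.20) restricted to one mode.
h_symmetric = 0.5 * omega * (a @ a_dag + a_dag @ a)
# Normal-ordered form: the creation operator is moved to the left.
h_normal = omega * (a_dag @ a)

# The highest level is distorted by the truncation, so it is excluded.
zero_point_shift = np.diag(h_symmetric - h_normal)[:-1]
normal_ordered_levels = np.diag(h_normal)

print(zero_point_shift)
print(normal_ordered_levels)
```

In this truncated model the two forms differ by a constant shift of `omega / 2` on every level that is not affected by the cutoff, and the normal-ordered levels start at zero.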

Once we have normal ordered the operators, we now have an explicitly positive 
Hamiltonian. In this fashion, we have been able to handle the question of negative 
energy states for the Klein—Gordon theory. (More subtleties concerning negative 
energy states will be discussed when we analyze the Dirac equation.) 

One essential point in introducing these creation and annihilation operators is 
that we can write down the N-particle Fock space: 


$$ |k_1, k_2, \cdots, k_N\rangle = a^\dagger(k_1)a^\dagger(k_2)\cdots a^\dagger(k_N)|0\rangle \tag{3.28} $$


This is the chief distinguishing feature between first and second quantization. In 
first quantization, we quantized the x; corresponding to a single particle. First 
quantized systems were hence inherently based on single-particle dynamics. In 
the second quantized formalism, by contrast, we quantize multiparticle states. 
To count how many particles we have of a certain momentum, we introduce the 
“number” operator: 


$$ N = \int d^3k\; a^\dagger(k)a(k) \tag{3.29} $$


The advantage of this number operator is that we can now calculate how many
particles there are of a certain momentum. For example, let $|n(k)\rangle$ equal a state
consisting of $n(k)$ identical particles with momentum $k$:


$$ |n(k)\rangle = \frac{\left(a^\dagger(k)\right)^{n(k)}}{\sqrt{n(k)!}}\,|0\rangle \tag{3.30} $$


It is easy to show [by commuting $a(k)$ to the right, until they annihilate on the
vacuum] that:


$$ N|n(k)\rangle = n(k)|n(k)\rangle \tag{3.31} $$

that is, $N$ simply counts the number of particles there are at momentum $k$. Not
surprisingly, a multiparticle state, consisting of many particles of different momenta,
can be represented as a Fock space:


$$ |n(k_1)n(k_2)\cdots n(k_m)\rangle = \prod_{i=1}^{m} \frac{\left(a^\dagger(k_i)\right)^{n(k_i)}}{\sqrt{n(k_i)!}}\,|0\rangle \tag{3.32} $$


Then the number operator N acting on this multiparticle state just counts the 
number of particles present: 


$$ N|n(k_1)n(k_2)\cdots n(k_m)\rangle = \left( \sum_{i=1}^{m} n(k_i) \right) |n(k_1)n(k_2)\cdots n(k_m)\rangle \tag{3.33} $$


Finally, it is essential to notice that the norm of these multiparticle states is
positive. If we define $\langle k| = \langle 0|a(k)$ and set $\langle 0|0\rangle = 1$, then the norm is given by
$\langle k|k'\rangle = +\delta^3(\mathbf{k} - \mathbf{k}')$. The norm is positive because the appropriate sign appears
in the commutation relation, Eq. (3.18). If the sign of the commutator had been
reversed and the norm were negative, then we would have a negative norm state,
or “ghost” state, which would give us negative probabilities and would violate
unitarity. (For example, we would not be able to write the completeness statement
$1 = \sum_n |n\rangle\langle n|$, which is used in unitarity arguments.) To preserve unitarity, it is
essential that a physical theory be totally free of ghost states (or that they cancel
completely). We will encounter this important question of ghosts repeatedly
throughout this book.


3.3 Charged Scalar Field


We can generalize our discussion of the Klein—Gordon field by postulating the 
existence of several scalar fields. In particular, we can arrange two independent 
scalar fields into a single complex field: 


$$ \phi = \frac{1}{\sqrt{2}}(\phi_1 + i\phi_2) \tag{3.34} $$

The action then becomes:

$$ \mathcal{L} = \partial_\mu\phi^\dagger\,\partial^\mu\phi - m^2\phi^\dagger\phi \tag{3.35} $$


If we insert this decomposition into the action, then we find the sum of two
independent actions for $\phi_1$ and $\phi_2$.

The quantization of this action proceeds as before by calculating the conjugate
field and postulating the canonical commutation relations. The conjugate field is
given by:


$$ \pi = \frac{\delta\mathcal{L}}{\delta\dot\phi} = \dot\phi^\dagger \tag{3.36} $$

The commutation relations now read:

$$ [\phi(\mathbf{x}, t), \pi(\mathbf{y}, t)] = i\delta^3(\mathbf{x} - \mathbf{y}) \tag{3.37} $$


We can always decompose this field in terms of its Fourier components: 


$$ \phi_i(x) = \int \frac{d^3k}{\sqrt{(2\pi)^3 2\omega_k}} \left( a_i(k)e^{-ik\cdot x} + a_i^\dagger(k)e^{ik\cdot x} \right) \tag{3.38} $$


Then the canonical commutation relations can be satisfied if the Fourier compo- 
nents obey the following commutation relations: 


$$ [a_i(k), a_j^\dagger(k')] = \delta^3(\mathbf{k} - \mathbf{k}')\,\delta_{ij} \tag{3.39} $$


All other commutators vanish. We could also choose the decomposition: 


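$$
\begin{aligned}
a(k) &= \frac{1}{\sqrt{2}}\left[a_1(k) + i a_2(k)\right]; &\quad a^\dagger(k) &= \frac{1}{\sqrt{2}}\left[a_1^\dagger(k) - i a_2^\dagger(k)\right] \\
b(k) &= \frac{1}{\sqrt{2}}\left[a_1(k) - i a_2(k)\right]; &\quad b^\dagger(k) &= \frac{1}{\sqrt{2}}\left[a_1^\dagger(k) + i a_2^\dagger(k)\right]
\end{aligned}
\tag{3.40}
$$


For these operators, the new commutation relations read:

$$ [a(k), a^\dagger(k')] = [b(k), b^\dagger(k')] = \delta^3(\mathbf{k} - \mathbf{k}') \tag{3.41} $$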


All other commutators are zero. Now let us construct the symmetries of the 
action and the corresponding Noether currents. The action is symmetric under the 
following transformation: 


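$$ \phi \to e^{i\theta}\phi, \qquad \phi^\dagger \to e^{-i\theta}\phi^\dagger \tag{3.42} $$


which generates a U(1) symmetry. Written out in components, we find, as in the
previous chapter, the following SO(2) transformation:


$$ \begin{pmatrix} \phi_1' \\ \phi_2' \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} \phi_1 \\ \phi_2 \end{pmatrix} \tag{3.43} $$

This symmetry generates a Noether current, which equals:

$$ J_\mu = i\phi^\dagger\partial_\mu\phi - i(\partial_\mu\phi^\dagger)\phi \tag{3.44} $$


Now let us calculate the charge $Q$ corresponding to this current in terms of
the quantized operators:


$$
\begin{aligned}
Q &= \int d^3x\; i\left(\phi^\dagger\dot\phi - \dot\phi^\dagger\phi\right) \\
&= \int d^3k \left[ a^\dagger(k)a(k) - b^\dagger(k)b(k) \right] \\
&= N_a - N_b
\end{aligned}
\tag{3.45}
$$


where the number operator for the $a$ and $b$ oscillator is given by:

$$ N_a = \int d^3k\; a^\dagger(k)a(k); \qquad N_b = \int d^3k\; b^\dagger(k)b(k) \tag{3.46} $$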
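
The equivalence between the U(1) phase rotation of Eq. (3.42) and the SO(2) rotation of Eq. (3.43) can be checked numerically at a single point. This is a minimal sketch; the numerical values of `theta`, `phi1`, and `phi2` are arbitrary illustrative choices.

```python
import numpy as np

theta = 0.7              # hypothetical rotation angle
phi1, phi2 = 1.3, -0.4   # hypothetical real field values at one point

# Complex field of Eq. (3.34) and its U(1) phase rotation, Eq. (3.42).
phi = (phi1 + 1j * phi2) / np.sqrt(2)
phi_phase_rotated = np.exp(1j * theta) * phi

# The same transformation acting on the real components, Eq. (3.43).
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
phi1_rot, phi2_rot = rotation @ np.array([phi1, phi2])
phi_from_rotation = (phi1_rot + 1j * phi2_rot) / np.sqrt(2)

print(np.isclose(phi_phase_rotated, phi_from_rotation))
print(np.isclose(abs(phi_phase_rotated) ** 2, abs(phi) ** 2))
```

Both checks agree because multiplying $\phi_1 + i\phi_2$ by $e^{i\theta}$ mixes the real and imaginary parts exactly as a rotation by $\theta$, which also leaves $\phi^\dagger\phi$ unchanged.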


Historically, this conserved current caused a certain amount of confusion.
If $J^0$ is considered to be the probability density of the wave function, then it
can be negative, and hence negative probabilities creep into the theory. In fact,
Schrödinger originally studied this equation as a candidate for the theory of the
electron but abandoned it because of these negative probabilities. As a
consequence, he later went on to write another equation that did not suffer from this
problem, the celebrated nonrelativistic Schrödinger equation.

However, in 1934 Pauli and Weisskopf’ finally gave the correct quantum 
interpretation of these negative probabilities. 

First, because of the crucial minus sign appearing in front of the $b$ oscillators,
we will find it convenient to redefine the current $J_\mu$ as the current corresponding
to the electric charge, rather than probability density, so that the $a$ oscillators
correspond to a positively charged particle and the b oscillators correspond to 
a negatively charged one. In this way, we can construct the quantum theory 
of charged scalar particles, where the minus sign appearing in the current is a 
desirable feature, rather than a fatal illness of the theory. In the next chapter, we 
will show how to couple this theory to the Maxwell field and hence rigorously 
show how this identification works. 

Second, we will interpret the $b^\dagger$ oscillator as the creation operator for a new
state of matter, antimatter. It was Dirac who originally grappled with these new
states found in any relativistic theory and deduced the fact that a new form of
matter, with opposite charge, must be given serious physical consideration. The
discovery of the antielectron gave graphic experimental proof of this conjecture.

Third, we no longer have a simple, single-particle interpretation of the $\phi(x)$,
which now contains both the matter and antimatter fields. Thus, we must abandon
the strict single-particle interpretation for $\phi$ and reinterpret it as a field. We will
see this unusual feature emerging again when we discuss the Dirac equation.


3.4 Propagator Theory 


Now that we have defined the canonical commutation relations among the particle 
fields, we are interested in how these particles actually move in space-time. To 
define a propagator, and also anticipate interactions, let us modify the
Klein-Gordon equation to include a source term $J(x)$:


$$ (\partial_\mu^2 + m^2)\phi(x) = J(x); \qquad \partial_\mu^2 = \partial_\mu\partial^\mu \tag{3.47} $$


To solve this equation, we use the standard theory of Green’s functions. We first 
define a propagator that satisfies: 


$$ (\partial_\mu^2 + m^2)\Delta_F(x - y) = -\delta^4(x - y) \tag{3.48} $$


Then the solution of the interacting ¢ field is given by: 


$$ \phi(x) = \phi_0(x) - \int d^4y\; \Delta_F(x - y)J(y) \tag{3.49} $$


where $\phi_0(x)$ is any function that satisfies the Klein-Gordon equation without any
source term. If we hit both sides of this expression with $(\partial_\mu^2 + m^2)$, then we find
that it satisfies the original Klein-Gordon equation in the presence of a source
term. As we know from the theory of Green’s functions, the way to solve this
equation is to take the Fourier transform:


$$ \Delta_F(x - y) = \int \frac{d^4k}{(2\pi)^4}\, e^{-ik\cdot(x - y)}\, \Delta_F(k) \tag{3.50} $$


If we hit both sides of this equation with $(\partial_\mu^2 + m^2)$, then we can solve for $\Delta_F(k)$:


$$ \Delta_F(k) = \frac{1}{k^2 - m^2} \tag{3.51} $$


At this point, however, we realize that there is an ambiguity in this equation. The
integral over $d^4k$ cannot be performed on the real axis, because the denominator
vanishes at $k^2 = m^2$. This same ambiguity, of course, occurs even in the classical
theory of wave equations, and is not specific to the Lorentz covariant theory. The
origin of this ambiguity lies not in the mathematics, but in the physics, in the fact
that we have yet to fix our boundary conditions.

Figure 3.1. Contour integration for the Green's function. The contour on the left gives us
the nonrelativistic retarded Green’s functions, while the contour on the right gives us the
Feynman prescription for a relativistic Green’s function.

For example, consider the Green’s function for the Schrödinger equation:


$$ \left( i\frac{\partial}{\partial t} - H_0 \right) G_0(x - x') = \delta^4(x - x') \tag{3.52} $$


If we take the Fourier transform of this equation and solve for the Green’s function,
we find:


$$ G_0(x - x') = \int \frac{d^4p}{(2\pi)^4}\, \frac{1}{\omega - \mathbf{p}^2/2m}\, e^{-ip\cdot(x - x')} \tag{3.53} $$


where $p^\mu = (\omega, \mathbf{p})$. This expression also suffers from an ambiguity, because the
integration over $\omega$ is divergent.

Let us take the convention that we integrate over the real $\omega$ axis as in Figure 3.1,
so that we integrate above the singularity. This can be accomplished by inserting
a factor of $i\epsilon$ into the denominator, replacing $\omega - \mathbf{p}^2/2m$ with $\omega - \mathbf{p}^2/2m + i\epsilon$.
Then the $\omega$ integration can be performed.

We simply convert the integration over the real axis into a contour integration
over a complex variable $\omega$. For $t > t'$ we close the contour with a large semicircle
in the lower half plane, whose contribution vanishes, such that:


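$$
\begin{aligned}
G_0(x - x') &= \int \frac{d^3p}{(2\pi)^3}\, e^{i\mathbf{p}\cdot(\mathbf{x} - \mathbf{x}')} \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, \frac{e^{-i\omega(t - t')}}{\omega - \mathbf{p}^2/2m + i\epsilon} \\
&= -i \int \frac{d^3p}{(2\pi)^3}\, e^{i\mathbf{p}\cdot(\mathbf{x} - \mathbf{x}') - i(\mathbf{p}^2/2m)(t - t')}\, \theta(t - t')
\end{aligned}
\tag{3.54}
$$

where the $\theta$ function has been written as:


$$ \theta(t - t') = -\lim_{\epsilon\to 0} \frac{1}{2\pi i} \int_{-\infty}^{\infty} \frac{e^{-i\omega(t - t')}}{\omega + i\epsilon}\, d\omega = \begin{cases} 1 & \text{if } t > t' \\ 0 & \text{otherwise} \end{cases} \tag{3.55} $$


which equals 1 for $t > t'$, and vanishes otherwise. (To prove this last relation,
extend the contour integral into a semicircle in the complex $\omega$ plane, closing the
contour in the lower half plane when $t > t'$. Then the contour integral picks up
the pole at $\omega = -i\epsilon$.)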

To see this a bit more explicitly, let us define: 


$$ \phi_p(x) = \frac{e^{-ip\cdot x}}{(2\pi)^{3/2}} \tag{3.56} $$

Then the Green’s function can be written as:

$$ G_0(x - x') = -i\theta(t - t') \int d^3p\; \phi_p(x)\phi_p^*(x') \tag{3.57} $$


In this way, the $+i\epsilon$ insertion has selected out the retarded Green’s function,
which obeys the usual concept of causality. Taking the $-i\epsilon$ prescription would
have given us the advanced Green’s function, which would violate causality.

Finally, taking the $d^3p$ integration (which is simply a Gaussian integral), we
find the final result for the Green’s function:


$$ G_0(x - x') = -i \left( \frac{m}{2\pi i (t - t')} \right)^{3/2} \exp\left( \frac{i m |\mathbf{x} - \mathbf{x}'|^2}{2(t - t')} \right) \theta(t - t') \tag{3.58} $$


which is just the Green’s function found in ordinary quantum mechanics. 

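Now that we have seen how various prescriptions for $i\epsilon$ give us various
boundary conditions, let us apply this knowledge to the relativistic case and
choose the following, unorthodox prescription:


$$ \Delta_F(k) = \frac{1}{k^2 - m^2 + i\epsilon} \tag{3.59} $$


To see how this $i\epsilon$ prescription modifies the boundary conditions, we will find it
useful to decompose this as:


$$ \frac{1}{k^2 - m^2 + i\epsilon} = \frac{1}{2\omega_k} \left( \frac{1}{k_0 - \omega_k + i\epsilon} - \frac{1}{k_0 + \omega_k - i\epsilon} \right) \tag{3.60} $$

The integral over $k^0$ now picks up contributions from both terms. Performing the
integration as before, we find:


$$
\begin{aligned}
\Delta_F(x - x') &= -i\theta(t - t') \int \frac{d^3k}{(2\pi)^3 2\omega_k}\, e^{-ik\cdot(x - x')} - i\theta(t' - t) \int \frac{d^3k}{(2\pi)^3 2\omega_k}\, e^{ik\cdot(x - x')} \\
&= -i\theta(t - t') \int d^3k\; \phi_k(x)\phi_k^*(x') - i\theta(t' - t) \int d^3k\; \phi_k^*(x)\phi_k(x')
\end{aligned}
\tag{3.61}
$$

with the relativistic plane waves $\phi_k(x) = e^{-ik\cdot x}/\sqrt{(2\pi)^3 2\omega_k}$.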


We see the rather unusual feature of this prescription: positive energy solutions 
are carried forward in time, but negative energy solutions are carried backwards 
in time. 

In classical physics, the usual solutions of the Maxwell theory give us retarded 
and advanced waves, and we eliminate the advanced waves by a choice of bound- 
ary conditions. However, in the quantum theory we are encountering a new type of 
propagator that, classically, makes no sense, with negative energy solutions going 
backwards in time. This propagator never appears in classical physics because it 
is complex and is hence forbidden. 

Quantum mechanically, negative energy solutions are an inherent problem 
with any relativistic theory. Even if we ban them at the beginning, quantum 
interactions will inevitably re-create them later. However, as we saw in the 
previous section, these negative energy states can be reinterpreted. Feynman’s 
approach to this problem was to assume that these negative energy states, because 
they are going backwards in time, appear as a new form of matter with positive 
energy going forwards in time, antimatter. Although matter going backwards 
in time seems to contradict causality, this poses no problem because one can 
show that, experimentally, a system where matter is going backwards in time is 
indistinguishable (if we reverse certain quantum numbers such as charge) from 
antimatter going forwards in time. For example, an electron placed in an electric 
field may move to the right; however, if it is moving backwards in time, it appears 
to move to the left. However, this is indistinguishable experimentally from a 
positively charged electron moving forwards in time to the left. In this way, 
we can interpret this theory as one in which everything (matter plus antimatter) 
has positive energy. (We will discuss this new reinterpretation further when we 
analyze the Dirac theory.) 

The previous expression for the Green’s function was written in terms of plane
waves $\phi_k$. However, we can replace the plane wave $\phi_k$ by the quantum field
$\phi(x)$ if we take the vacuum expectation value of the product of fields. From the
previous equation Eq. (3.61), we easily find:

$$ i\Delta_F(x - x') = \langle 0|T\phi(x)\phi(x')|0\rangle \tag{3.62} $$

where $T$ is called the time-ordering operator, defined as:


$$ T\phi(x)\phi(x') = \begin{cases} \phi(x)\phi(x') & \text{if } t > t' \\ \phi(x')\phi(x) & \text{if } t' > t \end{cases} \tag{3.63} $$


$T$ makes sure that the operator with the latest time component always appears to
the left. This equation for $\Delta_F$ is our most important result for propagators. It gives
us a bridge between the theory of propagators, in which scattering amplitudes are
written in terms of $\Delta_F(x - x')$, and the theory of operators, where everything is
written in terms of the quantum field $\phi(x)$. The previous expression will be crucial to
our discussion when we calculate the $S$ matrix for QED.

Finally, we remark that our theory must obey the laws of causality. For our
purposes, we will define microscopic causality as the statement that information
cannot travel faster than the speed of light. For field theory, this means that
$\phi(x)$ and $\phi(y)$ cannot interact with each other if they are separated by space-like
distances. Mathematically, this means that the commutator between these two
fields must vanish for space-like separations.

Repeating the earlier steps, we can show that this commutator equals:


$$
\begin{aligned}
[\phi(x), \phi(y)] &= i\Delta(x - y) \\
&= \int \frac{d^4k}{(2\pi)^3}\, \delta(k^2 - m^2)\, \epsilon(k_0)\, e^{-ik\cdot(x - y)} \\
&= \frac{i}{4\pi r} \frac{\partial}{\partial r} \begin{cases} J_0\!\left(m\sqrt{t^2 - r^2}\right) & t > r \\ 0 & -r < t < r \\ -J_0\!\left(m\sqrt{t^2 - r^2}\right) & t < -r \end{cases}
\end{aligned}
\tag{3.64}
$$


where $\epsilon(k_0)$ equals $+1$ ($-1$) for positive (negative) $k_0$, $t = x^0 - y^0$, $r = |\mathbf{x} - \mathbf{y}|$, and
$J_0$ is the Bessel function. (To prove this, convert the integral to radial coordinates,
and then perform the $k^0$ and $|\mathbf{k}|$ integrations.)

With this explicit form for the commutator, we can easily show that, for
space-like separations, we have:


$$ \Delta(x - y) = 0 \quad \text{if } (x - y)^2 < 0 \tag{3.65} $$


This shows that our construction obeys microscopic causality.


3.5 Dirac Spinor Field


After Dirac considered the relativistic theory of radiation in 1927, he set out the
next year to construct the relativistic theory of electrons. One severe limitation
was the problem of negative probabilities. He started with the observation that the
nonrelativistic Schrödinger equation did not have negative probabilities because
it was linear in time, while the Klein-Gordon equation, being quadratic in time,
did have negative probabilities.

Therefore Dirac tried to find a wave equation that was linear in time but still 
satisfied the relativistic mass-shell constraint: 


$$ p_\mu p^\mu = E^2 - \mathbf{p}^2 = m^2 \tag{3.66} $$


Dirac’s original idea was to take the “square root” of the energy equation. In this 
way, he stumbled onto the spinorial representation of the Lorentz group discussed 
in Chapter 2. He began with a first-order equation: 


$$ i\frac{\partial\psi}{\partial t} = (-i\alpha_i\nabla^i + \beta m)\psi \tag{3.67} $$


where $\alpha_i$ and $\beta$ are now constant matrices, not ordinary c numbers, which act on
$\psi$, a column matrix.

By squaring the operator in front of the $\psi$ field, we want to recover the
mass-shell condition:


$$ (-i\alpha_i\nabla^i + \beta m)^2\psi = (-\nabla^2 + m^2)\psi \tag{3.68} $$


This is only possible if we demand that the matrices satisfy:


$$
\begin{aligned}
\{\alpha_i, \alpha_k\} &= 2\delta_{ik} \\
\{\alpha_i, \beta\} &= 0 \\
\alpha_i^2 = \beta^2 &= 1
\end{aligned}
\tag{3.69}
$$


To make the equations more symmetrical, we can then define $\gamma^0 = \beta$ and $\gamma^i = \beta\alpha_i$.
Multiplying the wave equation by $\beta$, we then have the celebrated Dirac equation:


$$ (i\gamma^\mu\partial_\mu - m)\psi = 0 \tag{3.70} $$

where the $\gamma^\mu$ matrices satisfy:

$$ \{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu} \tag{3.71} $$


It is no accident that such a relativistic construction is possible. After all, in
the previous chapter we studied spinor representations of O(N) in Section 2.6
by defining a Clifford algebra, which is precisely the algebra formed by the $\gamma^\mu$.
Thus, what we are really constructing is the spin-$\tfrac{1}{2}$ representation of the Lorentz
group, that is, the spinors.

To calculate the behavior of this equation under the Lorentz group, let us define
how spinors transform under some representation $S(\Lambda)$ of the Lorentz group:


$$ \psi'(x') = S(\Lambda)\psi(x) \tag{3.72} $$

Then the Dirac equation transforms as follows:

$$ \left[ iS^{-1}(\Lambda)\gamma^\mu S(\Lambda)\,\partial'_\mu - m \right]\psi(x) = \left[ iS^{-1}(\Lambda)\gamma^\mu S(\Lambda)\,(\Lambda^{-1})^\nu{}_\mu\,\partial_\nu - m \right]\psi(x) = 0 \tag{3.73} $$

where we have multiplied the transformed Dirac equation by $S^{-1}(\Lambda)$ on the left,
and we have taken into account the transformed $\partial'_\mu = (\Lambda^{-1})^\nu{}_\mu\,\partial_\nu$. In order for the
equation to be Lorentz covariant, we must therefore have the following relation:


$$ S(\Lambda)\gamma^\mu S^{-1}(\Lambda) = (\Lambda^{-1})^\mu{}_\nu\,\gamma^\nu \tag{3.74} $$


which we first encountered in Section 2.6. To find an explicit representation for
$S(\Lambda)$, let us introduce the following matrix:


$$ \sigma_{\mu\nu} = \frac{i}{2}[\gamma_\mu, \gamma_\nu] \tag{3.75} $$


In Chapter 2, we saw that $(i/4)[\Gamma_\mu, \Gamma_\nu]$ are the generators of O(N) in the
spinor representation. Thus, the $\sigma_{\mu\nu}/2$ are the generators of the Lorentz group in
this representation.

Thus, we can write a new Lorentz group generator that is the sum of the old
generator $L_{\mu\nu}$ (which acts on the space-time coordinate) plus a new piece that
also generates the Lorentz group but in the spinor representation:


$$ M_{\mu\nu} = L_{\mu\nu} + \frac{1}{2}\sigma_{\mu\nu} \tag{3.76} $$

The $\sigma_{\mu\nu}$ also obey the following relation:


$$ [\gamma^\mu, \sigma_{\alpha\beta}] = 2i\left( \delta^\mu_\alpha\gamma_\beta - \delta^\mu_\beta\gamma_\alpha \right) \tag{3.77} $$

which shows that the Dirac matrices transform as vectors under the spinor
representation of the Lorentz group.


In terms of this new matrix, we can find an explicit representation of the $S(\Lambda)$
matrix:


$$ S(\Lambda) = e^{-(i/4)\sigma^{\mu\nu}\epsilon_{\mu\nu}} \tag{3.78} $$


Now that we know how spinors transform under the Lorentz group, we would like
next to construct invariants under the group. Let us take the Hermitian conjugate
of the Dirac equation:


$$ \psi^\dagger\left( i\gamma^{\mu\dagger}\overleftarrow{\partial}_\mu + m \right) = 0 \tag{3.79} $$


We will show shortly that there exists a representation of the Dirac matrices that
satisfies:


$$ (\gamma^0)^\dagger = \gamma^0, \qquad (\gamma^i)^\dagger = -\gamma^i \tag{3.80} $$


where $\gamma^i$ is anti-Hermitian and $\gamma^0$ is Hermitian. This can also be written as
$\gamma^{\mu\dagger} = \gamma_\mu$. (It may be puzzling that we did not take a representation that was
completely Hermitian. However, as we mentioned earlier, there are no
finite-dimensional unitary representations of the Lorentz group. If a purely Hermitian
representation of the Dirac matrices could be found, then we could construct the
generators of the Lorentz group out of them that would be unitary, violating this
theorem. Thus, we are forced to take non-Hermitian representations.)
Now let us define:


$$ \bar\psi = \psi^\dagger\gamma^0 \tag{3.81} $$


If we hit the conjugated equation of motion with $\gamma^0$, we can replace the $\gamma^{\mu\dagger}$ with
$\gamma^\mu$ matrices, leaving us with:


$$ \bar\psi\left( i\gamma^\mu\overleftarrow{\partial}_\mu + m \right) = 0 \tag{3.82} $$

Under a Lorentz transformation, this new field $\bar\psi$ obeys:


$$ \bar\psi'(x') = \psi^\dagger(x)S^\dagger(\Lambda)\gamma^0 = \bar\psi(x)S^{-1}(\Lambda) \tag{3.83} $$

This is just what we need to form invariants and covariant tensors. For example,
notice that $\bar\psi\psi$ is an invariant under the Lorentz group:


$$ \bar\psi'(x')\psi'(x') = \bar\psi(x)S^{-1}S\psi(x) = \bar\psi(x)\psi(x) \tag{3.84} $$

Similarly, $\bar\psi\gamma^\mu\psi$ is a genuine vector under the Lorentz group. We find:

$$ \bar\psi'(x')\gamma^\mu\psi'(x') = \bar\psi(x)S^{-1}\gamma^\mu S\psi(x) = \Lambda^\mu{}_\nu\,\bar\psi(x)\gamma^\nu\psi(x) \tag{3.85} $$

where we have used the fact that:

$$ S^{-1}\gamma^\mu S = \Lambda^\mu{}_\nu\,\gamma^\nu \tag{3.86} $$


which is nothing but the statement that the $\gamma^\mu$ transform as vectors under the
spinor representation of the Lorentz group, as we saw in Chapter 2. (To prove this
formula, take an infinitesimal Lorentz transformation. Then $S^{-1}\gamma^\mu S - \gamma^\mu$ becomes
proportional to the commutator between $\sigma_{\alpha\beta}$ and $\gamma^\mu$, which gives just another
gamma matrix. If we then exponentiate this process for finite transformations, we
find the previous equation, as desired.)
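
The vector transformation law $S^{-1}\gamma^\mu S = \Lambda^\mu{}_\nu\gamma^\nu$ can be checked numerically for a pure boost. The sketch below rests on several assumptions that are labeled in the comments: it uses the Dirac representation of Eq. (3.95), a boost along the z axis with an arbitrary rapidity, and the explicit boost matrix $S = \cosh(\phi/2) + \gamma^0\gamma^3\sinh(\phi/2)$, which is the z-axis case of the boost matrix written out later in this section.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
pauli = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

# Dirac representation of the gamma matrices (assumed here), Eq. (3.95).
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in pauli]

rapidity = 0.9  # hypothetical boost parameter along the z axis
c, s = np.cosh(rapidity / 2), np.sinh(rapidity / 2)
alpha_z = gamma[0] @ gamma[3]        # off-diagonal blocks equal sigma_z
S = c * np.eye(4) + s * alpha_z      # spinor boost matrix
S_inv = c * np.eye(4) - s * alpha_z  # inverse, since alpha_z squares to 1

# Vector representation of the same boost, Lambda^mu_nu.
boost = np.eye(4)
boost[0, 0] = boost[3, 3] = np.cosh(rapidity)
boost[0, 3] = boost[3, 0] = np.sinh(rapidity)

lhs = [S_inv @ gamma[mu] @ S for mu in range(4)]
rhs = [sum(boost[mu, nu] * gamma[nu] for nu in range(4)) for mu in range(4)]
covariant = all(np.allclose(l, r) for l, r in zip(lhs, rhs))
print(covariant)
```

Note that `S` is Hermitian rather than unitary, in line with the remark that the Lorentz group has no finite-dimensional unitary representations.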

In the same manner, it is also straightforward to show that $\bar\psi\sigma^{\mu\nu}\psi$ transforms
as a genuine antisymmetric second-rank tensor under the Lorentz group. To find
other Lorentz tensors that can be represented as bilinears in the spinors, let us
follow our discussion of Chapter 2 and introduce the matrix:


$$ \gamma_5 = \gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3 = -\frac{i}{4!}\epsilon_{\mu\nu\rho\sigma}\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma \tag{3.87} $$


where $\epsilon^{\mu\nu\rho\sigma} = -\epsilon_{\mu\nu\rho\sigma}$ and $\epsilon^{0123} = +1$. Because $\gamma_5$ transforms like $\epsilon^{\mu\nu\rho\sigma}$, it is a
pseudoscalar; that is, it changes sign under a parity transformation. Thus, $\bar\psi\gamma_5\psi$
is a pseudoscalar.

In fact, the complete set of bilinears, their transformation properties, and the 
number of elements within each tensor are given by: 


Scalar:        $\bar\psi\psi$                     [1]
Vector:        $\bar\psi\gamma^\mu\psi$           [4]
Tensor:        $\bar\psi\sigma^{\mu\nu}\psi$      [6]      (3.88)
Pseudovector:  $\bar\psi\gamma_5\gamma^\mu\psi$   [4]
Pseudoscalar:  $\bar\psi\gamma_5\psi$             [1]

There is a total of 16 independent components in this table. We can show that the
following 16 matrices are linearly independent:


$$ \Gamma^A = \{ I,\; \gamma^\mu,\; \sigma^{\mu\nu},\; \gamma_5\gamma^\mu,\; \gamma_5 \} \tag{3.89} $$


where $(\Gamma^A)^2 = \pm 1$. To show that this set of 16 matrices forms a complete set, let
us assume, for the moment, that a relation exists among them, so that:


$$ \sum_A c_A\Gamma^A = 0 \tag{3.90} $$


where $c_A$ are numbers. Then multiply this by $\Gamma^B$ and take the trace. If $\Gamma^B = I$,
we find that $c_I = 0$. If $\Gamma^B \neq I$, then we use the fact that there exists $\Gamma^C$ not
equal to unity such that $\Gamma^A\Gamma^B = \Gamma^C$ (up to a constant factor) if $A \neq B$. Taking the trace, we find that
$c_B = 0$. Since $B$ was arbitrary, this means that all coefficients are zero; so these
16 matrices must be linearly independent.
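
The trace argument above can be mirrored numerically. The following sketch assumes the Dirac representation of Eq. (3.95) for concreteness (any representation satisfying Eq. (3.71) should behave the same way); the variable names are illustrative. It builds the 16 matrices of Eq. (3.89), checks that the traces of products of distinct elements vanish, and confirms that the set has full rank.

```python
import numpy as np
from itertools import combinations

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
pauli = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

# Dirac representation (assumed), Eq. (3.95).
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in pauli]
gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]

# sigma^{mu nu} = (i/2)[gamma^mu, gamma^nu] for the six pairs mu < nu.
sigma_tensor = [0.5j * (gamma[m] @ gamma[n] - gamma[n] @ gamma[m])
                for m, n in combinations(range(4), 2)]

basis = ([np.eye(4, dtype=complex)] + gamma + sigma_tensor
         + [gamma5 @ g for g in gamma] + [gamma5])

# Flatten each 4 x 4 matrix into a row; full rank means linear independence.
stacked = np.array([m.reshape(-1) for m in basis])
rank = np.linalg.matrix_rank(stacked)

# Tr(Gamma^A Gamma^B) vanishes for A != B, as used in the trace argument.
gram = np.array([[np.trace(x @ y) for y in basis] for x in basis])
off_diagonal = gram - np.diag(np.diag(gram))

print(len(basis), rank)
```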

Because $\gamma^\mu$ transforms as a vector under the Lorentz group, the following
Lagrangian is invariant under the Lorentz group:


$$ \mathcal{L} = \bar\psi\left( i\gamma^\mu\partial_\mu - m \right)\psi \tag{3.91} $$


This, in turn, is the Lagrangian corresponding to the Dirac equation. Variations of
this Lagrangian by $\bar\psi$ or by $\psi$ will generate the two versions of the Dirac equation.

Up to now, we have not said anything specific about the representation of the 
Dirac matrices themselves. In fact, a considerable number of identities can be 
derived for these matrices in four dimensions without ever mentioning a specific 
representation, such as: 


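$$
\begin{aligned}
\gamma^\mu\gamma_\mu &= 4 \\
\gamma^\rho\gamma^\mu\gamma_\rho &= -2\gamma^\mu \\
\gamma^\rho\gamma^\mu\gamma^\nu\gamma_\rho &= 4g^{\mu\nu} \\
\gamma^\rho\gamma^\mu\gamma^\nu\gamma^\sigma\gamma_\rho &= -2\gamma^\sigma\gamma^\nu\gamma^\mu
\end{aligned}
\tag{3.92}
$$


Some trace operations can also be defined:


$$
\begin{aligned}
\mathrm{Tr}(\gamma_5\gamma^\mu) = \mathrm{Tr}\,\sigma^{\mu\nu} = \mathrm{Tr}(\gamma^\mu\gamma^\nu\gamma^\rho) &= 0 \\
\mathrm{Tr}(\gamma^\mu\gamma^\nu) &= 4g^{\mu\nu} \\
\mathrm{Tr}(\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma) &= 4\left( g^{\mu\nu}g^{\rho\sigma} - g^{\mu\rho}g^{\nu\sigma} + g^{\mu\sigma}g^{\nu\rho} \right) \\
\mathrm{Tr}(\gamma_5\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma) &= -4i\epsilon^{\mu\nu\rho\sigma}
\end{aligned}
\tag{3.93}
$$

In particular, this means:

$$ \mathrm{Tr}(\not{a}\,\not{b}\,\not{c}\,\not{d}) = 4\left[ (a\cdot b)(c\cdot d) - (a\cdot c)(b\cdot d) + (a\cdot d)(b\cdot c) \right] \tag{3.94} $$


where $\not{a} = a_\mu\gamma^\mu$.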
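
Although these identities are representation independent, a quick numerical spot check is easy to set up. The sketch below assumes the Dirac representation of Eq. (3.95) and the metric signature $(+,-,-,-)$; the helper names `slash` and `dot` and the random test vectors are illustrative choices. It verifies Eq. (3.94) and the $\gamma_5$ trace for the index choice 0123.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
pauli = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

# Dirac representation (assumed), Eq. (3.95).
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in pauli]
gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]
metric = np.diag([1.0, -1.0, -1.0, -1.0])

def slash(a):
    """Return a_mu gamma^mu for contravariant components a^mu."""
    lowered = metric @ a
    return sum(lowered[mu] * gamma[mu] for mu in range(4))

def dot(a, b):
    """Minkowski inner product a . b."""
    return a @ metric @ b

rng = np.random.default_rng(0)
a, b, c, d = rng.normal(size=(4, 4))  # four arbitrary real four-vectors

lhs = np.trace(slash(a) @ slash(b) @ slash(c) @ slash(d))
rhs = 4 * (dot(a, b) * dot(c, d) - dot(a, c) * dot(b, d) + dot(a, d) * dot(b, c))

# Tr(gamma_5 gamma^0 gamma^1 gamma^2 gamma^3), with epsilon^{0123} = +1.
trace_g5 = np.trace(gamma5 @ gamma[0] @ gamma[1] @ gamma[2] @ gamma[3])

print(np.isclose(lhs, rhs), trace_g5)
```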
It is often convenient to find an explicit representation of the Dirac matrices. 
The most common representation of these matrices is the Dirac representation: 


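$$ \gamma^0 = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}; \qquad \gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix} \tag{3.95} $$


where $\sigma^i$ are the familiar Pauli spin matrices. Then the spinor $\psi$ is a
complex-valued field with four components describing a massive, spin-$\tfrac{1}{2}$ field.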
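
As a sanity check on this representation, one can verify numerically that the matrices of Eq. (3.95) satisfy the Clifford algebra of Eq. (3.71) and the Hermiticity properties of Eq. (3.80). This is only an illustrative sketch; the variable names are our own.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
pauli = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

# Dirac representation, Eq. (3.95).
gamma0 = np.block([[I2, Z2], [Z2, -I2]])
gammas = [gamma0] + [np.block([[Z2, s], [-s, Z2]]) for s in pauli]
metric = np.diag([1.0, -1.0, -1.0, -1.0])

def anticommutator(a, b):
    return a @ b + b @ a

# {gamma^mu, gamma^nu} = 2 g^{mu nu}, Eq. (3.71).
clifford_ok = all(
    np.allclose(anticommutator(gammas[m], gammas[n]), 2 * metric[m, n] * np.eye(4))
    for m in range(4) for n in range(4)
)

# gamma^0 is Hermitian and the gamma^i are anti-Hermitian, Eq. (3.80).
gamma0_hermitian = np.allclose(gamma0.conj().T, gamma0)
spatial_antihermitian = all(np.allclose(g.conj().T, -g) for g in gammas[1:])

# gamma_5 = i gamma^0 gamma^1 gamma^2 gamma^3 is off-diagonal in this basis.
gamma5 = 1j * gammas[0] @ gammas[1] @ gammas[2] @ gammas[3]

print(clifford_ok, gamma0_hermitian, spatial_antihermitian)
```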

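Now let us try to decompose $\psi(x)$ into plane waves in order to begin canonical
quantization. To do this, we need to find a set of independent basis spinors for $\psi$.
We will make the obvious choice:


$$ u_1(0) = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}; \quad u_2(0) = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}; \quad v_1(0) = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}; \quad v_2(0) = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} \tag{3.96} $$
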
The trick is to act upon these spinors with S(A) in order to boost them up to 
momentum p. The momentum-dependent spinors are given by: 


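$$
\begin{aligned}
u_\alpha(p) &= S(\Lambda)u_\alpha(0) \\
v_\alpha(p) &= S(\Lambda)v_\alpha(0)
\end{aligned}
\tag{3.97}
$$

which can be shown to obey:

$$
\begin{aligned}
(\gamma\cdot p - m)u(p) &= 0 \\
(\gamma\cdot p + m)v(p) &= 0 \\
\bar u(p)(\gamma\cdot p - m) &= 0 \\
\bar v(p)(\gamma\cdot p + m) &= 0
\end{aligned}
\tag{3.98}
$$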


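The Lorentz transformation matrix $S(\Lambda)$ is not difficult to construct if we set
all rotations to zero, leaving us with only Lorentz boosts. Then the only generators
we have are the $K$ generators, which in turn are proportional to $\sigma^i$. Specifically,
we have:


$$
S(\Lambda) = \begin{pmatrix} \cosh(\phi/2) & \boldsymbol{\sigma}\cdot\mathbf{n}\,\sinh(\phi/2) \\ \boldsymbol{\sigma}\cdot\mathbf{n}\,\sinh(\phi/2) & \cosh(\phi/2) \end{pmatrix}
= \sqrt{\frac{E+m}{2m}} \begin{pmatrix} 1 & \dfrac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E+m} \\ \dfrac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E+m} & 1 \end{pmatrix}
$$


where $\cosh(\phi/2) = [(E + m)/2m]^{1/2}$ and $\sinh(\phi/2) = [(E - m)/2m]^{1/2}$.
Applying $S(\Lambda)$ to the independent spinor basis, we easily find:


$$ u_1(p) = \sqrt{\frac{E+m}{2m}} \begin{pmatrix} 1 \\ 0 \\ \dfrac{p_z}{E+m} \\ \dfrac{p_+}{E+m} \end{pmatrix}; \qquad u_2(p) = \sqrt{\frac{E+m}{2m}} \begin{pmatrix} 0 \\ 1 \\ \dfrac{p_-}{E+m} \\ \dfrac{-p_z}{E+m} \end{pmatrix} \tag{3.99} $$

$$ v_1(p) = \sqrt{\frac{E+m}{2m}} \begin{pmatrix} \dfrac{p_z}{E+m} \\ \dfrac{p_+}{E+m} \\ 1 \\ 0 \end{pmatrix}; \qquad v_2(p) = \sqrt{\frac{E+m}{2m}} \begin{pmatrix} \dfrac{p_-}{E+m} \\ \dfrac{-p_z}{E+m} \\ 0 \\ 1 \end{pmatrix} \tag{3.100} $$


where $p_\pm = p_x \pm ip_y$.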

Because of the particular decomposition we have chosen, the $u$ spinors
correspond to electrons with positive energy (moving forwards in time), while
the $v$ spinors correspond to electrons with negative energy (moving backwards in
time).
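
The boosted spinors of Eqs. (3.99) and (3.100) can be tested directly against the momentum-space Dirac equations (3.98). The sketch below assumes the Dirac representation of Eq. (3.95); the mass, the three-momentum, and the helper names `basis_spinors` and `bar` are illustrative choices, not part of the text.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
pauli = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in pauli]

def basis_spinors(p3, mass):
    """Boosted u and v spinors of Eqs. (3.99)-(3.100) for three-momentum p3."""
    px, py, pz = p3
    energy = np.sqrt(mass**2 + px**2 + py**2 + pz**2)
    e = energy + mass
    norm = np.sqrt(e / (2 * mass))
    p_plus, p_minus = px + 1j * py, px - 1j * py
    u1 = norm * np.array([1, 0, pz / e, p_plus / e])
    u2 = norm * np.array([0, 1, p_minus / e, -pz / e])
    v1 = norm * np.array([pz / e, p_plus / e, 1, 0])
    v2 = norm * np.array([p_minus / e, -pz / e, 0, 1])
    return energy, [u1, u2], [v1, v2]

def bar(spinor):
    """Dirac conjugate, spinor^dagger gamma^0, as in Eq. (3.81)."""
    return spinor.conj() @ gamma[0]

mass = 1.0                        # hypothetical mass
p3 = np.array([0.3, -0.5, 0.8])   # hypothetical three-momentum
energy, u, v = basis_spinors(p3, mass)

# gamma . p = E gamma^0 - p . gamma
p_slash = energy * gamma[0] - sum(p3[i] * gamma[i + 1] for i in range(3))

u_residuals = [np.linalg.norm((p_slash - mass * np.eye(4)) @ s) for s in u]
v_residuals = [np.linalg.norm((p_slash + mass * np.eye(4)) @ s) for s in v]
u_norms = [bar(s) @ s for s in u]
v_norms = [bar(s) @ s for s in v]

print(u_residuals, v_residuals)
print(u_norms, v_norms)
```

With these conventions the residuals vanish to numerical precision, and the invariant norms come out as $\bar u u = 1$ and $\bar v v = -1$.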

Next, we would like to describe spinors of definite spin. In many experi- 
ments, we can produce polarized beams of electrons; so it becomes important to 
understand how to incorporate projection operators that can select definite spin. 

This is not as simple as one might suspect, since the intuitive concept of spin is 
rooted in our notion of the rotation group, which is only a subgroup of the Lorentz 
group. Hence, the naive concept of spin and its eigenfunctions no longer applies 
for boosted systems. 

In the rest frame, however, we know that the spin of a system can be described
by a three-vector $\mathbf{s}$ that points in a certain direction; so we may introduce the
four-vector $s_\mu$, which, in its rest frame, reduces to $s_\mu = (0, \mathbf{s})$. Then, by demanding
that this transform as a four-vector, we can boost this spin vector by a Lorentz
transformation. Since we define $\mathbf{s}^2 = 1$, this means that $s_\mu^2 = -1$. In the rest frame,
we have $p_\mu = (m, \mathbf{0})$; so we also have $p_\mu s^\mu = 0$, which must also hold in any
boosted frame by Lorentz invariance. Thus, we now have two Lorentz-invariant
conditions on the spin four-vector:

$$
\begin{aligned}
s_\mu^2 &= -1 \\
p_\mu s^\mu &= 0
\end{aligned}
\tag{3.101}
$$


Next, we would like to define a projection operator that selects out states of
definite spin. Again, we will define the Lorentz-invariant projection operator by
first examining the rest frame. At rest, we know that the operator $\boldsymbol{\sigma}\cdot\mathbf{s}$ serves as
an operator that determines the spin of a system:


$$
\begin{aligned}
\boldsymbol{\sigma}\cdot\mathbf{s}\;u_\alpha(0) &= u_\alpha(0) \\
\boldsymbol{\sigma}\cdot\mathbf{s}\;v_\alpha(0) &= -v_\alpha(0)
\end{aligned}
\tag{3.102}
$$

For a spin-$\tfrac{1}{2}$ system, the projection operator at rest can be written as:

$$ P(\mathbf{s}) = \frac{1 \pm \boldsymbol{\sigma}\cdot\mathbf{s}}{2} \tag{3.103} $$


where the $+$ refers to the $u$ spinor, and the $-$ refers to the $v$ spinor. Our goal is to
write a boosted version of this expression. Let us define the projection operator:


$$ P(s) = \frac{1 + \gamma_5\not{s}}{2} \tag{3.104} $$

It is easy to show that, in the rest frame, this projection reduces to:

$$ P(s) = \frac{1}{2} \begin{pmatrix} 1 + \boldsymbol{\sigma}\cdot\mathbf{s} & 0 \\ 0 & 1 - \boldsymbol{\sigma}\cdot\mathbf{s} \end{pmatrix} \tag{3.105} $$


Therefore, this operator reduces to the previous one, so this is the desired
expression. The new eigenfunctions now have a spin $s$ associated with them: $u(k, s)$.
They satisfy:


$$
\begin{aligned}
P(s)u(k, s) &= u(k, s) \\
P(s)v(k, s) &= v(k, s) \\
P(-s)u(k, s) &= P(-s)v(k, s) = 0
\end{aligned}
\tag{3.106}
$$


These spinors are quite useful for practical calculations because they satisfy
certain completeness relations. Any four-spinor can be written in terms of linear
combinations of the four $u_\alpha(0)$ and $v_\alpha(0)$ because they span the space of
four-spinors. If we boost these spinors with $S(\Lambda)$, then $u_\alpha(p)$ and $v_\alpha(p)$ span the space
of all four-spinors satisfying the Dirac equation.

Likewise, $u_\alpha(0)\bar v_\beta(0)$, etc. have 16 independent elements, which in turn span
the entire space of 4 x 4 matrices. Thus, $u_\alpha(p)\bar v_\beta(p)$, etc. span the space of all
4 x 4 matrices that also satisfy the Dirac equation.

We first normalize our spinors with the following conventions: 


$$
\begin{aligned}
\bar u(p, s)u(p, s) &= 1 \\
\bar v(p, s)v(p, s) &= -1
\end{aligned}
\tag{3.107}
$$


With these normalizations, we can show that these spinors obey certain complete- 
ness relations: 


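$$ \sum_s \left[ u_\alpha(p, s)\bar u_\beta(p, s) - v_\alpha(p, s)\bar v_\beta(p, s) \right] = \delta_{\alpha\beta} \tag{3.108} $$


For the particular representation we have chosen, we find:


$$ u_\alpha(p, s)\bar u_\beta(p, s) = \left( \frac{\not{p} + m}{2m}\, \frac{1 + \gamma_5\not{s}}{2} \right)_{\alpha\beta} \tag{3.109} $$


and:


$$ v_\alpha(p, s)\bar v_\beta(p, s) = -\left( \frac{-\not{p} + m}{2m}\, \frac{1 + \gamma_5\not{s}}{2} \right)_{\alpha\beta} \tag{3.110} $$


If we sum over the helicity $s$, we have two projection operators:


$$
\begin{aligned}
[\Lambda_+(p)]_{\alpha\beta} &= \sum_s u_\alpha(p, s)\bar u_\beta(p, s) = \left( \frac{\not{p} + m}{2m} \right)_{\alpha\beta} \\
[\Lambda_-(p)]_{\alpha\beta} &= -\sum_s v_\alpha(p, s)\bar v_\beta(p, s) = \left( \frac{-\not{p} + m}{2m} \right)_{\alpha\beta}
\end{aligned}
\tag{3.111}
$$


These projection operators satisfy:

$$ \Lambda_\pm^2 = \Lambda_\pm; \qquad \Lambda_+\Lambda_- = 0; \qquad \Lambda_+ + \Lambda_- = 1 \tag{3.112} $$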


Because of the completeness relations, $\Lambda_\pm$ has a simple interpretation: It projects
out the positive or negative energy solution.
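
The projector algebra of Eq. (3.112) follows from the on-shell condition $\not{p}^2 = m^2$, and it can be confirmed numerically. The sketch below again assumes the Dirac representation of Eq. (3.95), with an arbitrary illustrative mass and three-momentum.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
pauli = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in pauli]
I4 = np.eye(4, dtype=complex)

mass = 1.0                        # hypothetical mass
p3 = np.array([0.3, -0.5, 0.8])   # hypothetical three-momentum
energy = np.sqrt(mass**2 + p3 @ p3)   # on-shell energy

# gamma . p = E gamma^0 - p . gamma
p_slash = energy * gamma[0] - sum(p3[i] * gamma[i + 1] for i in range(3))

lambda_plus = (p_slash + mass * I4) / (2 * mass)
lambda_minus = (-p_slash + mass * I4) / (2 * mass)

idempotent = (np.allclose(lambda_plus @ lambda_plus, lambda_plus)
              and np.allclose(lambda_minus @ lambda_minus, lambda_minus))
orthogonal = np.allclose(lambda_plus @ lambda_minus, 0)
complete = np.allclose(lambda_plus + lambda_minus, I4)

print(idempotent, orthogonal, complete, np.trace(lambda_plus).real)
```

The trace of each projector equals 2, matching the two spin states in each of the positive and negative energy sectors.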
3.6 Quantizing the Spinor Field


So far, we have only discussed the classical theory. To second quantize the Dirac 
field, we first calculate the momentum canonically conjugate to the spinor field: 


$$ \pi(x) = \frac{\delta\mathcal{L}}{\delta\dot\psi(x)} = i\psi^\dagger \tag{3.113} $$


Let us decompose the spinor field into its Fourier moments: 


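$$
\begin{aligned}
\psi(x) &= \int \frac{d^3k}{(2\pi)^{3/2}} \sqrt{\frac{m}{k_0}} \sum_{\alpha=1,2} \left[ b_\alpha(k)u_\alpha(k)e^{-ik\cdot x} + d_\alpha^\dagger(k)v_\alpha(k)e^{ik\cdot x} \right] \\
\bar\psi(x) &= \int \frac{d^3k}{(2\pi)^{3/2}} \sqrt{\frac{m}{k_0}} \sum_{\alpha=1,2} \left[ b_\alpha^\dagger(k)\bar u_\alpha(k)e^{ik\cdot x} + d_\alpha(k)\bar v_\alpha(k)e^{-ik\cdot x} \right]
\end{aligned}
\tag{3.114}
$$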


In terms of particles and antiparticles, this particular decomposition gives the 
following physical interpretation: 


$$ \psi(x) = \begin{cases} b(p)u(p)e^{-ip\cdot x} & \text{Annihilates positive energy electron} \\ d^\dagger(p)v(p)e^{ip\cdot x} & \text{Creates positive energy positron} \end{cases} \tag{3.115} $$


(Having $d^\dagger$ create a positive energy positron can be viewed as annihilating a
negative energy electron.)


Repeating the steps we took for the scalar particle, we invert these equations 
and solve for the Fourier moments in terms of the fields themselves: 


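$$
\begin{aligned}
b_\alpha(k) &= \int d^3x\; \bar U_k^\alpha(x)\gamma^0\psi(x) \\
b_\alpha^\dagger(k) &= \int d^3x\; \bar\psi(x)\gamma^0 U_k^\alpha(x) \\
d_\alpha(k) &= \int d^3x\; \bar\psi(x)\gamma^0 V_k^\alpha(x) \\
d_\alpha^\dagger(k) &= \int d^3x\; \bar V_k^\alpha(x)\gamma^0\psi(x)
\end{aligned}
\tag{3.116}
$$


where:


$$
\begin{aligned}
U_k^\alpha(x) &= \sqrt{\frac{m}{(2\pi)^3 k_0}}\; u_\alpha(k)e^{-ik\cdot x} \\
V_k^\alpha(x) &= \sqrt{\frac{m}{(2\pi)^3 k_0}}\; v_\alpha(k)e^{ik\cdot x}
\end{aligned}
\tag{3.117}
$$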


Now let us insert the Fourier decomposition back into the expression for the 
Hamiltonian: 


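$$\begin{aligned}
H &= \int d^3x\, \left( \pi \dot{\psi} - \mathscr{L} \right) \\
&= \int d^3x\; \psi^\dagger\, i\partial_0\, \psi \\
&= \int d^3k\; k_0 \sum_\alpha \left[ b^\dagger_\alpha(k)\,b_\alpha(k) - d_\alpha(k)\,d^\dagger_\alpha(k) \right]
\end{aligned} \tag{3.118}$$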


Here we encounter a serious problem. We find that the energy can be negative,
because the $d$ oscillators enter the Hamiltonian with a minus sign. There is,
however, an important way in which this minus sign can be banished. Let us define
the canonical equal-time relations of the fields and conjugate fields with
anticommutators, instead of commutators:


$$\{ \psi_i(\mathbf{x}, t),\, \psi^\dagger_j(\mathbf{y}, t) \} = \delta^3(\mathbf{x} - \mathbf{y})\,\delta_{ij} \tag{3.119}$$


In order to satisfy the canonical anticommutation relations, the Fourier moments 
must themselves obey anticommutation relations given by: 


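$$\begin{aligned}
\{ b_\alpha(k),\, b^\dagger_{\alpha'}(k') \} &= \delta_{\alpha\alpha'}\,\delta^3(\mathbf{k} - \mathbf{k}') \\
\{ d_\alpha(k),\, d^\dagger_{\alpha'}(k') \} &= \delta_{\alpha\alpha'}\,\delta^3(\mathbf{k} - \mathbf{k}')
\end{aligned} \tag{3.120}$$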


Now, if we normal order the Hamiltonian, we must also drop the infinite zero 
point energy, and hence: 


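$$\begin{aligned}
H &= \int d^3k\; k_0 \sum_\alpha \left[ b^\dagger_\alpha(k)\,b_\alpha(k) + d^\dagger_\alpha(k)\,d_\alpha(k) \right] \\
\mathbf{P} &= \int d^3k\; \mathbf{k} \sum_\alpha \left[ b^\dagger_\alpha(k)\,b_\alpha(k) + d^\dagger_\alpha(k)\,d_\alpha(k) \right]
\end{aligned} \tag{3.121}$$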


Thus, the use of anticommutation relations and normal ordering nicely solves the 
problem of the Hamiltonian with negative energy eigenvalues. 
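
To make the role of the anticommutators concrete, here is a minimal toy sketch with just one electron mode and one positron mode of energy `omega`. It is not the field-theoretic calculation itself: the 4x4 matrix realization (a Jordan-Wigner style construction) and all names in the code are assumptions chosen for illustration. With anticommuting operators, $-d\,d^\dagger = d^\dagger d - 1$, so the negative sign in Eq. (3.118) turns into a positive number operator plus a constant that normal ordering discards.

```python
def kron(a, b):
    """Kronecker product of two matrices stored as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b, sign=1):
    """Return a + sign*b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def dagger(a):
    """Hermitian conjugate; a plain transpose, since every matrix here is real."""
    return [list(col) for col in zip(*a)]

def anticomm(a, b):
    return add(matmul(a, b), matmul(b, a))

I2 = [[1, 0], [0, 1]]
Z = [[1, 0], [0, -1]]
A = [[0, 1], [0, 0]]      # single-mode annihilation operator

b = kron(A, I2)           # electron mode
d = kron(Z, A)            # positron mode; the Z factor makes b and d anticommute
I4 = kron(I2, I2)
ZERO4 = [[0] * 4 for _ in range(4)]

omega = 1                 # mode energy k_0 in arbitrary units
n_b = matmul(dagger(b), b)
n_d = matmul(dagger(d), d)

h_naive = add(n_b, matmul(d, dagger(d)), sign=-1)   # b^dag b - d d^dag, cf. Eq. (3.118)
h_normal = add(n_b, n_d)                            # b^dag b + d^dag d, cf. Eq. (3.121)

# Both matrices are diagonal in this basis, so the diagonals are the spectra.
spectrum_naive = [omega * h_naive[i][i] for i in range(4)]
spectrum_normal = [omega * h_normal[i][i] for i in range(4)]
print(spectrum_naive, spectrum_normal)
```

The two spectra differ only by the constant shift `omega`, the toy analogue of the infinite zero point energy dropped in the text, and the normal-ordered spectrum is non-negative.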

Furthermore, the $d^\dagger$ operators can be interpreted as creation operators for
antimatter (or annihilation operators for negative energy electrons). In fact, this
was Dirac’s original motivation for postulating antimatter in the first place. To see 
how this interpretation of these new states emerges, we first notice that the Dirac 
Lagrangian is invariant under: 


$$\psi \to e^{i\Lambda}\psi, \qquad \bar{\psi} \to \bar{\psi}\,e^{-i\Lambda} \tag{3.122}$$

Therefore, there should be a conserved current associated with this symmetry. A
direct application of Noether’s theorem yields:


$$J^\mu = \bar{\psi}\gamma^\mu\psi, \qquad \partial_\mu J^\mu = 0 \tag{3.123}$$


which is conserved, if we use the Dirac equation. 

Classically, the conserved charge is positive definite since it is proportional
to $\psi^\dagger\psi$. This was, in fact, an improvement over the classical Klein-Gordon
equation, where the charge could be negative. However, once we quantize the
system, the Dirac charge can also be negative. The quantized charge associated 
with this current is given by: 


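$$\begin{aligned}
Q &= \int d^3x\, J^0 = \int d^3x\; {:}\psi^\dagger\psi{:} \\
&= \int d^3k \sum_\alpha \left[ b^\dagger_\alpha(k)\,b_\alpha(k) - d^\dagger_\alpha(k)\,d_\alpha(k) \right]
\end{aligned} \tag{3.124}$$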


This quantity can be negative, and hence cannot be associated with the probability 
density. However, we can, as in the Klein—Gordon case, interpret this as the 
current associated with the coupling to electromagnetism; so Q corresponds to the 
electric charge. In this case, the minus sign in Q is a desirable feature, because
it means that $d^\dagger$ is the creation operator of antimatter, that is, a positron with
opposite charge to the electron.

Again, this also means that we have to abandon the simple-minded interpretation
of $\psi$ as a single-electron wave function, since it now describes both the
electron and the antielectron. The anticommutation relations also reproduce the
Pauli Exclusion Principle found in quantum mechanics. Because
$d^\dagger_\alpha(k)\,d^\dagger_\alpha(k) = 0$, only one particle can occupy a distinct
energy state with definite spin. Thus, a multiparticle state is given by:


$$\prod_{i=1}^{N} d^\dagger_{\alpha_i}(k_i) \prod_{j=1}^{M} b^\dagger_{\alpha_j}(k_j)\,|0\rangle \tag{3.125}$$

with only one particle in any given quantum state. This is the first example of the
spin-statistics theorem, that field theories defined with integer spin are quantized 
with commutators and are called bosons, while theories with half-integral spins 
are quantized with anticommutators and are called fermions. The existence of 
two types of statistics, one based on commutators (i.e., Bose—Einstein statistics) 
and one based on anticommutators (i.e., Fermi—Dirac), has been experimentally 
observed in a wide variety of physical situations and has been applied to explain 
the behavior of low-temperature systems and even white dwarf stars.

Repeating the same steps that we used for the Klein-Gordon field, we can also
compute the energy-momentum tensor and the angular momentum tensor from
Noether’s theorem. It is easy to show:


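$$\begin{aligned}
T^{\mu\nu} &= i\bar{\psi}\gamma^\mu \partial^\nu \psi \\
\mathscr{M}^{\rho\mu\nu} &= i\bar{\psi}\gamma^\rho \left( x^\mu \partial^\nu - x^\nu \partial^\mu - \frac{i}{2}\sigma^{\mu\nu} \right) \psi
\end{aligned} \tag{3.126}$$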


(The Lagrangian term in the energy-momentum tensor can be dropped since the 
Dirac Lagrangian is zero if the equations of motion are obeyed.) 
The conserved angular momentum tensor is therefore: 


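$$\begin{aligned}
M^{\mu\nu} &= \int \mathscr{M}^{0\mu\nu}\, d^3x \\
&= \int d^3x\; i\psi^\dagger \left( x^\mu \partial^\nu - x^\nu \partial^\mu - \frac{i}{2}\sigma^{\mu\nu} \right) \psi
\end{aligned} \tag{3.127}$$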


The crucial difference between these equations and the Klein-Gordon case is that
the angular momentum tensor contains an extra piece, proportional to $\sigma^{\mu\nu}$, which
represents the fact that the theory has nontrivial spin $\frac{1}{2}$.

It is then easy to complete this discussion by calculating how a quantized 
spinor field transforms under the Poincaré group: 


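$$\begin{aligned}
i[P^\mu, \psi] &= \partial^\mu \psi \\
i[M^{\mu\nu}, \psi] &= \left( x^\mu \partial^\nu - x^\nu \partial^\mu - \frac{i}{2}\sigma^{\mu\nu} \right) \psi
\end{aligned} \tag{3.128}$$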


With these operator identities, we can confirm that the quantized spinor field
transforms as a spin-$\frac{1}{2}$ field under the Poincaré group:


$$U(\Lambda, a)\,\psi_\alpha(x)\,U^{-1}(\Lambda, a) = S^{-1}(\Lambda)_{\alpha\beta}\,\psi_\beta(\Lambda x + a) \tag{3.129}$$


As we mentioned earlier, one of our fundamental assumptions about quantum
field theory is that it must be causal. Not surprisingly, the spin-statistics theorem
is also intimately tied to the question of microcausality, that is, that no signals 
can propagate faster than the speed of light. From our field theory perspective, 
microcausality can be interpreted to mean that the commutator (anticommutator) 
of two boson (fermion) fields vanishes for spacelike separations: 


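$$\begin{aligned}
[\phi(x), \phi(y)] &= 0 \quad \text{for } (x - y)^2 < 0 \\
\{\psi(x), \bar{\psi}(y)\} &= 0 \quad \text{for } (x - y)^2 < 0
\end{aligned} \tag{3.130}$$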
To demonstrate the spin-statistics theorem, let us quantize bosons with
anticommutators and arrive at a contradiction. Repeating earlier steps, we find,
for large separations:

$$\begin{aligned}
\langle 0|\{\phi(x), \phi(y)\}|0\rangle &= \int \frac{d^3k}{(2\pi)^3\, 2\omega_k} \left( e^{-ik\cdot(x-y)} + e^{ik\cdot(x-y)} \right) \\
&\sim \frac{\exp\left( -m\sqrt{|\mathbf{x} - \mathbf{y}|^2 - (x^0 - y^0)^2} \right)}{|\mathbf{x} - \mathbf{y}|^2 - (x^0 - y^0)^2}
\end{aligned} \tag{3.131}$$

which clearly violates our original assumption of microcausality. (Likewise, one
can prove that fermions quantized with commutators violate microcausality.)

Historically, anticommutators and antimatter were introduced by Dirac, who 
was troubled that his theory seemed riddled with negative energy states. Since 
physical systems prefer the state of lowest energy, there is a finite probability 
that all the electrons in nature would decay into these negative energy states, 
thereby creating a catastrophe. To solve the problem of negative energy states, 
Dirac was led to postulate a radically new interpretation of the vacuum (which is 
consistent with Feynman’s interpretation, which we have chosen in this chapter). 
He postulated that the vacuum consisted of an infinite sea of filled negative energy 
states. Ordinary matter did not suddenly radiate an infinite amount of energy and 
cascade down to negative energy because the negative energy sea was completely 
filled. By the anticommutation relations, only one electron can occupy a negative 
energy state at a time; so an electron could not decay into the negative energy sea
if it was already filled. In this way, electrons of positive energy could not cascade 
in energy down to negative energy states. 

However, once in a while an electron may be knocked out of the negative 
energy sea, creating a “hole.” This hole would act as if it were a particle. Dirac
noticed that the absence of an electron of charge —|e| and negative energy —E is 
equivalent to the presence of a particle of positive charge +|e| and positive energy 
+E. This hole then had positive charge and the same mass as the electron. All
particles therefore had positive energy: both the original positive energy electron
and the absence of a negative energy electron possessed positive energy E
(Fig. 3.2).

Dirac postulated that this hole would correspond to a new state of matter, 
an antielectron (although he initially considered the possibility that the hole was 
a proton). The vacuum was now elevated to an infinite storehouse of negative 
energy matter. 

Dirac’s hole theory meant that a new physical process was possible, pair 
production, where matter appeared out of the empty vacuum. Photons could 
knock an electron out of its negative energy sea, leaving us with an electron and 
its hole, that is, an electron and an antielectron. 
Figure 3.2. Dirac’s “hole” picture. When a photon kicks an electron out of the infinite
negative energy sea, it leaves a “hole” that behaves as if it had positive energy and positive 
charge (i.e., an anti-electron). This is pair production. 


At first, Dirac’s theory of an infinite sea of filled negative energy states was 
met with extreme skepticism. In the Handbuch der Physik, Pauli wrote: “Thus
γ-ray photons (at least two in order to satisfy the laws of conservation of energy
and momentum) must be able to transform, by themselves, into an electron and
an antielectron. We do not believe, therefore, that this explanation can be
seriously considered.” When Pauli’s discouraging article appeared, Anderson had
already observed the antielectron in cloud chamber photographs, verifying Dirac’s 
conjecture. When confronted with the undeniable experimental verification of an- 
timatter, Pauli later revised his opinion of Dirac’s theory and made his famous 
remark, “... with his fine instinct for physical realities, he started his argument 
without knowing the end of it.” 

The interpretation that we have chosen in this chapter (that negative energy 
electrons going backwards in time are equivalent to positive energy antielectrons 
going forwards in time) is equivalent to Dirac’s infinite negative energy sea. In 
fact, when we subtracted off an infinite constant in the Hamiltonian, this can be 
interpreted as subtracting off the energy of Dirac’s infinite sea of negative energy 
states. 

To see how these positive and negative energy states move in time, let us 
define the evolution of a wave function via a source as follows: 


$$(i\gamma^\mu \partial_\mu - m)\,\psi(x) = J(x) \tag{3.132}$$


To solve this equation, we introduce the Dirac propagator by: 


$$(i\gamma^\mu \partial_\mu - m)\,S_F(x - y) = \delta^4(x - y) \tag{3.133}$$


Then the solution to the wave equation is given by: 


$$\psi(x) = \psi_0(x) + \int d^4y\; S_F(x - y)\,J(y) \tag{3.134}$$

where $\psi_0$ solves the homogeneous Dirac equation.




An explicit representation of the Dirac propagator can be obtained by using 
the Fourier transform: 


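$$S_F(x - y) = \int \frac{d^4p}{(2\pi)^4}\; e^{-ip\cdot(x-y)}\, \frac{\slashed{p} + m}{p^2 - m^2 + i\epsilon} \tag{3.135}$$

which satisfies:

$$S_F(x - y) = (i\gamma^\mu \partial_\mu + m)\,\Delta_F(x - y) \tag{3.136}$$

where $\Delta_F(x - y)$ is the Klein-Gordon propagator.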

As before, we can solve for the propagator by integrating over $p^0$. The
integration is identical to the one found earlier for the Klein-Gordon equation,
except now we have additional factors of $\slashed{p} + m$ in the numerator.

Integrating out the energy, we can write the Green’s function in terms of plane 
waves. The result is almost identical to the expression found for the Klein—Gordon 
propagator, except for the insertion of gamma matrices: 


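$$\begin{aligned}
S_F(x - x') &= -i \int \frac{d^3p}{(2\pi)^3}\, \frac{m}{E} \left[ \Lambda_+(p)\, e^{-ip\cdot(x - x')}\, \theta(t - t') + \Lambda_-(p)\, e^{ip\cdot(x - x')}\, \theta(t' - t) \right] \\
&= -i\theta(t - t') \int d^3p \sum_{r=1}^{2} \psi^r_p(x)\,\bar{\psi}^r_p(x') + i\theta(t' - t) \int d^3p \sum_{r=3}^{4} \psi^r_p(x)\,\bar{\psi}^r_p(x')
\end{aligned} \tag{3.137}$$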


where: 


$$\psi^r_p(x) = \sqrt{\frac{m}{E}}\; (2\pi)^{-3/2}\; w^r(p)\; e^{-i\epsilon_r p\cdot x} \tag{3.138}$$


where $\epsilon_r = (1, 1, -1, -1)$ and $w^1 = u^1$, $w^2 = u^2$, $w^3 = v^1$, and $w^4 = v^2$. Written
in this fashion, the states with positive energy propagate forward in time, while
the states with negative energy propagate backwards in time.

As in the Klein-Gordon case, we can now replace the plane waves $\psi_p$ with the
quantized spinor field $\psi(x)$ by taking the vacuum expectation value of the spinor
fields:


$$iS_F(x - y)_{\alpha\beta} = \langle 0|\,T\,\psi_\alpha(x)\,\bar{\psi}_\beta(y)\,|0\rangle \tag{3.139}$$


which is one of the most important results of this section. We will use this 
expression repeatedly in our discussion of scattering matrices. 
3.7 Weyl Neutrinos


In the previous chapter, we saw that the spinorial representation of the Lorentz 
group is actually reducible if we introduce projection operators $P_L$ and $P_R$.
Normally, we are not concerned with this because this spinorial representation is
irreducible under the full Poincaré group for massive states. 

However, there is a situation when the spinorial representation becomes re- 
ducible even under the Poincaré group, and that is when the fermion is massless. 
For example, we can take an imaginary representation of the $\gamma$ matrices, which
gives us Majorana spinors’ (see Appendix). For our purposes, what is more 
interesting is taking the Weyl representation,'® which gives us a representation of
neutrinos. 

If we take the representation: 


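$$\gamma^0 = \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix}; \qquad \gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix}; \qquad \gamma_5 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$


then in this representation, we can write down two chiral operators:

$$P_R = \frac{1 + \gamma_5}{2} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}; \qquad P_L = \frac{1 - \gamma_5}{2} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$$


To see how these projection operators affect the electron field, let us split $\psi$ as
follows:

$$\psi = \begin{pmatrix} \psi_R \\ \psi_L \end{pmatrix} \tag{3.141}$$


Then we have:

$$P_L\,\psi = \begin{pmatrix} 0 \\ \psi_L \end{pmatrix}; \qquad P_R\,\psi = \begin{pmatrix} \psi_R \\ 0 \end{pmatrix}$$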


Because $P_L$ and $P_R$ commute with the Lorentz generators:


$$[P_{L,R},\, \sigma^{\mu\nu}] = 0 \tag{3.142}$$

it means that the four-component spinor $\psi$ actually spans a reducible
representation of the Lorentz group. Contained within the complex Dirac representation
are two distinct chiral representations of the Lorentz group. Although $\psi_L$ and
$\psi_R$ separately form an irreducible representation of the Lorentz group, they do
not form a representation of the Poincaré group for massive particles. To find an
irreducible representation of the Poincaré group, we must impose m = 0.

The reason that these chiral fermions must be massless is that the mass
term $m\bar{\psi}\psi$ in the Lagrangian is not invariant under these two separate Lorentz
transformations. Because:


$$\bar{\psi}\psi = \bar{\psi}_L\psi_R + \bar{\psi}_R\psi_L \tag{3.143}$$


mass terms in the action necessarily mix these two distinct representations of the 
Lorentz group. Thus, this representation forces us to have massless fermions; that 
is, this is a theory of massless neutrinos. 

The theory of massless neutrinos is therefore invariant under the following 
chiral transformation: 


$$\psi \to e^{i\gamma_5\Lambda}\,\psi, \qquad \bar{\psi} \to \bar{\psi}\,e^{i\gamma_5\Lambda} \tag{3.144}$$


This symmetry is violated by mass terms. In other words, the spinor representa- 
tion of the Poincaré group (for zero-mass particles) is reducible, and we have the 
freedom to choose two-component rather than four-component spinor representa- 
tions. 
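
The chiral structure described in this section can be verified with a few lines of matrix arithmetic. The sketch below assumes one common form of the Weyl (chiral) representation, $\gamma^0$ with off-diagonal blocks $-1$ and $\gamma^i$ with off-diagonal blocks $\pm\sigma^i$; other sign conventions exist, and the helper names are invented for this example. It checks the Clifford algebra, that $\gamma_5$ is diagonal, that $P_R$ commutes with every $[\gamma^\mu, \gamma^\nu]$ (and hence with $\sigma^{\mu\nu}$), and that $\gamma^0$, which appears in $\bar{\psi}\psi = \psi^\dagger\gamma^0\psi$, only connects opposite chiralities.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lincomb(ca, a, cb, b):
    """Return ca*a + cb*b, entry by entry."""
    return [[ca * x + cb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def blocks(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[i] + tr[i] for i in range(2)] + [bl[i] + br[i] for i in range(2)]

def neg(a):
    return [[-x for x in row] for row in a]

def commutator(a, b):
    return lincomb(1, matmul(a, b), -1, matmul(b, a))

I2 = [[1, 0], [0, 1]]
O2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# Assumed Weyl representation: gamma^0 first, then gamma^1..gamma^3.
GAMMA = [blocks(O2, neg(I2), neg(I2), O2)]
GAMMA += [blocks(O2, s, neg(s), O2) for s in PAULI]
METRIC = [1, -1, -1, -1]
ONE = blocks(I2, O2, O2, I2)
ZERO = blocks(O2, O2, O2, O2)

# gamma_5 = i gamma^0 gamma^1 gamma^2 gamma^3
g5 = GAMMA[0]
for g in GAMMA[1:]:
    g5 = matmul(g5, g)
g5 = [[1j * x for x in row] for row in g5]

P_R = lincomb(0.5, ONE, 0.5, g5)
P_L = lincomb(0.5, ONE, -0.5, g5)

# {gamma^mu, gamma^nu} = 2 g^{mu nu}
clifford_ok = all(
    lincomb(1, matmul(GAMMA[m], GAMMA[n]), 1, matmul(GAMMA[n], GAMMA[m]))
    == [[2 * METRIC[m] * (m == n) * (i == j) for j in range(4)] for i in range(4)]
    for m in range(4) for n in range(4)
)

# sigma^{mu nu} is proportional to [gamma^mu, gamma^nu], so this tests Eq. (3.142).
chiral_commutes = all(
    commutator(P_R, commutator(GAMMA[m], GAMMA[n])) == ZERO
    for m in range(4) for n in range(4)
)

# gamma^0 has no same-chirality block, so the mass term mixes L and R.
mass_mixes = (matmul(P_R, matmul(GAMMA[0], P_R)) == ZERO
              and matmul(P_L, matmul(GAMMA[0], P_R)) != ZERO)

print(clifford_ok, chiral_commutes, mass_mixes)
```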

This will have important phenomenological implications later on when we 
consider the quark model in the limit of small quark masses. Then we can use 
the power of chiral symmetry in this limit to extract a large number of nontrivial 
relations among S matrix elements, called sum rules. In addition, the actual values 
of the masses of the quarks then give us a handle on the size of the violation of
these chiral sum rules. 

In summary, canonical quantization gives a rigorous formulation of a second 
quantized field theory capable of describing multiparticle states. There are other, 
more elegant quantization programs, but the canonical one is perhaps the most 
rigorous. We also saw that the Dirac equation emerged from a spinorial rep- 
resentation of the Lorentz group, which was developed in the previous chapter. 
One of the successes of the second quantized approach is that we have a physical 
interpretation for the negative energy states that inevitably occur in any relativistic 
formulation. In the next chapter, we will quantize a spin-one field and couple it 
to the Dirac electron theory, giving us quantum electrodynamics. 
3.8 Exercises


1. Prove that the anticommutation relation of the Dirac spinors in Eq. (3.119) is 
satisfied if the harmonic oscillator states obey the anticommutation relation 
in Eq. (3.120). 


2. Prove that: 


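$$\begin{aligned}
\mathrm{Tr}(\slashed{a}_1 \slashed{a}_2 \cdots \slashed{a}_{2n}) = {} & a_1 \cdot a_2\, \mathrm{Tr}(\slashed{a}_3 \cdots \slashed{a}_{2n}) - a_1 \cdot a_3\, \mathrm{Tr}(\slashed{a}_2 \slashed{a}_4 \cdots \slashed{a}_{2n}) \\
& + \cdots + a_1 \cdot a_{2n}\, \mathrm{Tr}(\slashed{a}_2 \cdots \slashed{a}_{2n-1})
\end{aligned} \tag{3.145}$$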


3. Consider the 16 matrices $\Gamma_A$, where $A = S, T, V, P, A$. Show that $(\Gamma_A)^2 =
\pm 1$. Show that each is traceless except for the scalar. Given $\Gamma_A$ and $\Gamma_B$
($A \neq B$), show that there exists $\Gamma_C$ (not equal to unity) such that:


$$\Gamma_A \Gamma_B = \Gamma_C \tag{3.146}$$


4. By inserting the Fourier decomposition of the fields in Eq. (3.64), prove 
explicitly that: 


$$[\phi(x), \phi(y)] = i\Delta(x - y) = \int \frac{d^4k}{(2\pi)^3}\; \delta(k^2 - m^2)\,\epsilon(k_0)\,e^{-ik\cdot(x - y)} \tag{3.147}$$


Then perform the integration over k, leaving us with a Bessel function. Then 
show that the commutator vanishes outside the light cone, thereby establishing 
the causality of the system. 


5. Prove Eq. (3.22) by explicitly performing the commutation relations. 
6. Prove that Eqs. (3.108) and (3.109) are obeyed by explicit computation. 


7. Prove the following formula, due to Fierz: 


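$$\sum_{A=S,V,T,A,P} c_A\,(\Gamma^A)_{\alpha\beta}\,(\Gamma_A)_{\gamma\delta} = \sum_{B=S,V,T,A,P} c'_B\,(\Gamma^B)_{\alpha\delta}\,(\Gamma_B)_{\gamma\beta} \tag{3.148}$$

where:

$$\begin{pmatrix} c'_S \\ c'_V \\ c'_T \\ c'_A \\ c'_P \end{pmatrix} = \frac{1}{4} \begin{pmatrix} 1 & 4 & 12 & -4 & 1 \\ 1 & -2 & 0 & -2 & -1 \\ \tfrac{1}{2} & 0 & -2 & 0 & \tfrac{1}{2} \\ -1 & -2 & 0 & -2 & 1 \\ 1 & -4 & 12 & 4 & 1 \end{pmatrix} \begin{pmatrix} c_S \\ c_V \\ c_T \\ c_A \\ c_P \end{pmatrix} \tag{3.149}$$

(Hint: use the fact that the $\Gamma_A$ matrices form a complete set of 16 matrices.
Then treat the above expression as a matrix equation in order to power expand
it in terms of the $\Gamma_A$.)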


8. Use the Fierz identity to re-express $(\bar{\psi}_1 A \psi_2)(\bar{\psi}_3 B \psi_4)$ in terms of Dirac
bilinears $(\bar{\psi}_1 C \psi_4)(\bar{\psi}_3 D \psi_2)$. Express the matrices C and D in terms of the
matrices A and B.


9. The appearance of $\gamma^0$ within $\bar{\psi}\psi = \psi^\dagger\gamma^0\psi$ seems to violate Lorentz
invariance, since $\gamma^0$ is manifestly non-invariant and transforms as the zeroth
component of a vector. So why is $\bar{\psi}\psi$ still a Lorentz invariant? Furthermore,
$\psi$ clearly forms a finite dimensional representation of the Lorentz group.
But this seems to be a contradiction of our no-go theorem. Is this so? If not,
then why not?


10. Let $u(p)$ be spinors which satisfy the Dirac equation. Then prove the Gordon
formula:


$$\bar{u}(p_2)\,\gamma^\mu\,u(p_1) = \bar{u}(p_2) \left( \frac{p_1^\mu + p_2^\mu}{2m} + \frac{i\sigma^{\mu\nu} q_\nu}{2m} \right) u(p_1) \tag{3.150}$$


where $q_\mu$ is the momentum transfer.
11. We define brackets to mean summing over all antisymmetric combinations of
indices (see (A.12)):


1 
plier N= ae svg yen (Gal51) 


In an arbitrary number of space-time dimensions, prove that: 
N 
yh ybibe EN = yy Meio HN Cee ae (3.152) 


i=] 


where $\hat{\mu}_i$ means that the $\mu_i$ index is to be deleted in the sum. Prove:


ae SGD eer Be ne (3.153) 


where g#¥P = gli gla _ gio gu, 


12. Based on the previous problem, derive a formula for: 


Ye UN a Hae (3.154) 
Prove it by induction.


13. Prove, by direct computation, that the Hamiltonian and the charge can be
written in terms of Dirac harmonic oscillators as in Eqs. (3.118) and (3.124).


14. The Feynman propagator in x space can actually be computed analytically.
Set m = 0, use radial coordinates, and show that the Feynman propagator in
x space can be written in terms of Bessel functions.


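15. Consider a unitary transformation U, with $H' = UHU^{-1}$ and $\psi' = U\psi$, which
changes the Dirac equation to:


$$i\frac{\partial \psi'}{\partial t} = H'\psi' \tag{3.155}$$


Let U be given by:


$$U = \sqrt{\frac{m + |E|}{2|E|}} + \frac{\beta\,\boldsymbol{\alpha}\cdot\mathbf{p}}{\sqrt{2|E|\,(m + |E|)}} \tag{3.156}$$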


Show that U removes all coupling between the positive and negative energy 
parts of the Dirac equation. This is an example of the Foldy-Wouthuysen 
transformation. 


16. Prove that the asymptotic behavior of the anticommutator appearing in Eq.
(3.131) is correct. Repeat the same calculation to show that spinors quantized 
with commutators also violate the spin-statistics theorem. 


17. Prove that the derivative of a theta function gives us a Dirac delta function:

$$\frac{\partial}{\partial t}\,\theta(t - t') = \delta(t - t') \tag{3.157}$$

From this, prove that:

$$(\partial_\mu^2 + m^2)\; T\,\phi(x)\,\phi(x') = -i\,\delta^4(x - x') \tag{3.158}$$


Show that this equation is compatible with the expression for $\Delta_F$ in terms of
the time-ordered product of two scalar fields.


18. The Feynman propagator $\Delta_F(x - y)$ can be expressed as the vacuum
expectation value of the time-ordered product of two fields. The time-ordering
operator T appears to violate Lorentz invariance, since $x^0$ is singled out. Why
is the expression still Lorentz invariant?


Chapter 4 
Quantum Electrodynamics 


It was found that this equation gave the particle a spin of half a quantum.
And also gave it a magnetic moment. It gave us the properties that one
needed for an electron. That was really an unexpected bonus for me,
completely unexpected.
—P.A.M. Dirac 


4.1 Maxwell’s Equations 


Now that we have successfully quantized the free Dirac electron, we would like 
to discuss the question of coupling the Dirac electron to a spin-one Maxwell 
field $A_\mu$. The resulting theory will be called quantum electrodynamics, which
is perhaps the most successful physical theory ever proposed. After several 
decades of confusion, false starts, and frustration, QED has emerged as one of the 
cornerstones of the quantum theory. 

Our discussion of the massless, spin-one field begins with the classical equa- 
tions of Maxwell: 


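$$\begin{aligned}
\operatorname{div} \mathbf{E} &= \rho \\
\operatorname{curl} \mathbf{B} - \frac{\partial \mathbf{E}}{\partial t} &= \mathbf{j} \\
\operatorname{div} \mathbf{B} &= 0 \\
\operatorname{curl} \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} &= 0
\end{aligned} \tag{4.1}$$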


The source, in turn, obeys a conservation equation: 
$$\frac{\partial \rho}{\partial t} + \operatorname{div} \mathbf{j} = 0 \tag{4.2}$$


Because the divergence of a curl is equal to zero, and because the curl of 
the gradient is equal to zero, we can replace the magnetic and electric field with 
potentials $A^0$ and $\mathbf{A}$ as follows:

$$\mathbf{E} = -\nabla A^0 - \frac{\partial \mathbf{A}}{\partial t}; \qquad \mathbf{B} = \operatorname{curl} \mathbf{A} \tag{4.3}$$


The fact that these equations did not transform according to the standard 
Galilean transformation led to the discovery of special relativity. To see this 
relativistic invariance more clearly, let us define: 

$$\begin{aligned}
A^\mu &= (A^0, \mathbf{A}) \\
j^\mu &= (\rho, \mathbf{j})
\end{aligned} \tag{4.4}$$

Then the current conservation equation can be written: 
$$\partial_\mu j^\mu = 0 \tag{4.5}$$


and Maxwell’s equations can be summarized as: 


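$$\partial_\mu F^{\mu\nu} = j^\nu \tag{4.6}$$

where:

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu \tag{4.7}$$

or:

$$F^{\mu\nu} = \begin{pmatrix} 0 & -E^1 & -E^2 & -E^3 \\ E^1 & 0 & -B^3 & B^2 \\ E^2 & B^3 & 0 & -B^1 \\ E^3 & -B^2 & B^1 & 0 \end{pmatrix} \tag{4.8}$$

and:

$$F^{0i} = -E^i; \qquad F^{ij} = -\epsilon^{ijk} B^k \tag{4.9}$$
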
We can now derive Maxwell’s equations by writing down the following action: 


$$\mathscr{L} = -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} = \frac{1}{2}\left( \mathbf{E}^2 - \mathbf{B}^2 \right) \tag{4.10}$$

If we insert this into the Euler-Lagrange equations of motion, we find that the
equations of motion are given by:


$$\partial_\mu F^{\mu\nu} = 0 \tag{4.11}$$


which is just the classical Maxwell’s equation with zero source. 

A key consequence of this construction is that the Maxwell theory is invariant
under a local symmetry, that is, one whose parameters are dependent on
space-time:


$$\delta A_\mu = \partial_\mu \Lambda(x) \tag{4.12}$$


(A transformation whose parameters are constants is called a global transforma- 
tion, like the isospin and Lorentz transformations discussed earlier.) If we apply 
successive gauge transformations, we find that they form a group with the simple 
addition law: 


$$\Lambda_3 = \Lambda_1 + \Lambda_2 \tag{4.13}$$


This is the same group law we found for U(1); so we see that Maxwell’s equations 
are locally invariant under U(1). 
Under this transformation, the Maxwell tensor is an invariant: 


$$F'_{\mu\nu} = F_{\mu\nu} \tag{4.14}$$


so the Lagrangian is also invariant. 

This also means that there is a large redundancy associated with the theory. 
The equations for $A_\mu$ are identical to the equations for $A'_\mu = A_\mu + \partial_\mu \Lambda$.

We also note that the naive energy-momentum tensor associated with 
Maxwell’s theory has the wrong properties. It is neither symmetric, nor is it 
gauge invariant. A naive application of Noether’s theorem gives us an energy— 
momentum tensor that equals: 


$$T^{\mu\nu} = -F^{\mu\rho}\,\partial^\nu A_\rho + \frac{1}{4}\, g^{\mu\nu} F_{\rho\sigma} F^{\rho\sigma} \tag{4.15}$$


which is not symmetric. This means that there is no conserved angular momentum
tensor. Worse, it is not even gauge invariant, since it is not written entirely in
terms of the Maxwell tensor $F_{\mu\nu}$.

However, since the energy-momentum tensor is not a directly measurable
quantity, we are free to add another tensor to it:


$$T^{\mu\nu} \to T^{\mu\nu} + \partial_\rho \left( F^{\mu\rho} A^\nu \right) \tag{4.16}$$

The resulting energy-momentum tensor is conserved, symmetric, and gauge
invariant:


$$T^{\mu\nu} = F^{\mu\rho} F_\rho{}^{\nu} + \frac{1}{4}\, g^{\mu\nu} F_{\rho\sigma} F^{\rho\sigma} \tag{4.17}$$

It is important to note that the addition of this extra term does not affect the
integrated charges, which are directly measurable. To see what conserved charges
are associated with this energy-momentum tensor, we find:


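$$\begin{aligned}
T^{00} &= \frac{1}{2}\left( \mathbf{E}^2 + \mathbf{B}^2 \right) \\
T^{i0} &= (\mathbf{E} \times \mathbf{B})^i
\end{aligned} \tag{4.18}$$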


which we recognize as the energy density and the Poynting vector. Thus, this new 
energy—momentum tensor is a physically acceptable quantity and compatible with 
gauge invariance. 
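
As a sanity check on the symmetric tensor, the following sketch evaluates $T^{\mu\nu} = F^{\mu\rho}F_\rho{}^{\nu} + \frac{1}{4} g^{\mu\nu} F_{\rho\sigma}F^{\rho\sigma}$ for one sample field configuration, using exact rational arithmetic. It assumes the metric signature $(+,-,-,-)$ and the sign conventions $F^{0i} = -E^i$, $F^{ij} = -\epsilon^{ijk}B^k$; the sample values of E and B and the function names are arbitrary choices for this illustration.

```python
from fractions import Fraction

G = [1, -1, -1, -1]  # diagonal metric, signature (+, -, -, -)

def field_tensor(E, B):
    """Contravariant F^{mu nu} with F^{0i} = -E^i and F^{ij} = -eps^{ijk} B^k."""
    E1, E2, E3 = E
    B1, B2, B3 = B
    return [[0, -E1, -E2, -E3],
            [E1, 0, -B3, B2],
            [E2, B3, 0, -B1],
            [E3, -B2, B1, 0]]

def stress_tensor(E, B):
    """T^{mu nu} = F^{mu rho} F_rho^nu + (1/4) g^{mu nu} F_{rho sigma} F^{rho sigma}."""
    F = field_tensor(E, B)
    # Lorentz invariant F_{rho sigma} F^{rho sigma}; indices lowered with the diagonal metric.
    inv = sum(G[r] * G[s] * F[r][s] ** 2 for r in range(4) for s in range(4))
    T = [[Fraction(0)] * 4 for _ in range(4)]
    for mu in range(4):
        for nu in range(4):
            term = sum(F[mu][r] * G[r] * F[r][nu] for r in range(4))
            metric = G[mu] if mu == nu else 0
            T[mu][nu] = Fraction(term) + Fraction(inv, 4) * metric
    return T

E = (1, 2, 3)    # sample electric field, arbitrary units
B = (-1, 0, 2)   # sample magnetic field, arbitrary units
T = stress_tensor(E, B)

energy_density = Fraction(sum(x * x for x in E) + sum(x * x for x in B), 2)
poynting = [E[1] * B[2] - E[2] * B[1],
            E[2] * B[0] - E[0] * B[2],
            E[0] * B[1] - E[1] * B[0]]

print(T[0][0], [T[i][0] for i in (1, 2, 3)])
```

For these sample fields the code reproduces the energy density $\frac{1}{2}(\mathbf{E}^2 + \mathbf{B}^2)$ in `T[0][0]` and the Poynting vector in the `T[i][0]` components; the tensor also comes out symmetric and traceless.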

Gauge invariance is thus a guide to calculating the physical properties of field 
theory. As we shall see throughout this book, it is also absolutely important to 
maintain gauge invariance for QED, for several reasons: 


1. The proof of renormalization, that we can extract a finite S matrix order by
order from the quantum theory, is crucially dependent on gauge invariance. 


2. The proof of unitarity, that there are no ghost states with negative norm in the 
theory, also depends on gauge invariance. (The longitudinal vibration modes 
of the Maxwell field are negative norm states, which can be eliminated by 
choosing a gauge, such as the Coulomb gauge.) 


3. The proof that the theory is Lorentz invariant after we have fixed the gauge 
in a nonrelativistic fashion requires the use of gauge symmetry. 


4.2 Relativistic Quantum Mechanics 


The problem facing us now is to write down the action for the Dirac theory coupled 
to the Maxwell theory, creating the quantum theory of electrodynamics. The most 
convenient way is to use the electron current as the source for the Maxwell field. 
The electron current is given by $\bar{\psi}\gamma^\mu\psi$, and hence we propose the coupling:


$$e A_\mu \bar{\psi}\gamma^\mu\psi \tag{4.19}$$

We saw earlier that this current emerged because of the invariance of the Dirac
action under the symmetry $\psi \to \exp(i\Lambda)\psi$. Let us promote $\Lambda$ into a local gauge
parameter, so that $\Lambda$ is a function of space-time. We want an action invariant
under:


$$\begin{aligned}
\psi(x) &\to e^{ie\Lambda(x)}\,\psi(x) \\
A_\mu &\to A_\mu - \partial_\mu \Lambda(x)
\end{aligned} \tag{4.20}$$

The problem with this transformation is that $\Lambda(x)$ is a function of space-time;
therefore, the derivative of a spinor $\partial_\mu \psi$ is not a covariant object. It picks up an
extraneous term $\partial_\mu \Lambda(x)$ in its transformation. To eliminate this extraneous term,
we introduce the covariant derivative:


$$\partial_\mu \to D_\mu \equiv \partial_\mu + ieA_\mu \tag{4.21}$$


The advantage of introducing the covariant derivative is that it transforms covari- 
antly under a gauge transformation: 


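$$\begin{aligned}
D_\mu \psi &\to e^{ie\Lambda} D_\mu \psi + \left( ie\,\partial_\mu \Lambda - ie\,\partial_\mu \Lambda \right) e^{ie\Lambda}\psi \\
&= e^{ie\Lambda} D_\mu \psi
\end{aligned} \tag{4.22}$$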


This means that the following action is gauge invariant: 
$$\mathscr{L} = \bar{\psi}\left( i\gamma^\mu D_\mu - m \right)\psi - \frac{1}{4} F_{\mu\nu} F^{\mu\nu} \tag{4.23}$$


which we obtain by simply replacing $\partial_\mu$ by $D_\mu$.

The coupling of the electron to the Maxwell field reproduces the coupling 
proposed earlier. In the limit of velocities small compared to the speed of light, 
the Dirac equation should reduce to a modified version of the Schrödinger equation.
We are hence interested in checking the correctness of the Dirac equation to lowest 
order, to see if we can reproduce the nonrelativistic results and corrections to them. 
The Dirac equation of motion, in the presence of an electromagnetic potential, 
now becomes: 


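$$i\frac{\partial \psi}{\partial t} = \left[ \boldsymbol{\alpha}\cdot(-i\nabla - e\mathbf{A}) + \beta m + eA^0 \right] \psi \tag{4.24}$$


To find solutions of this equation, let us decompose this four-spinor into two
smaller two-spinors:

$$\psi = \begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} e^{-imt}\,\Phi \\ e^{-imt}\,\Psi \end{pmatrix} \tag{4.25}$$

Then the Dirac equation can be decomposed as the sum of two two-spinor
equations:


$$\begin{aligned}
i\dot{\phi} &= \boldsymbol{\sigma}\cdot\boldsymbol{\pi}\,\chi + eA^0\phi + m\phi \\
i\dot{\chi} &= \boldsymbol{\sigma}\cdot\boldsymbol{\pi}\,\phi + eA^0\chi - m\chi
\end{aligned} \tag{4.26}$$


where $\boldsymbol{\pi} = \mathbf{p} - e\mathbf{A}$. Next, we will eliminate $\chi$. For small fields, we can make the
approximation that $eA^0 \ll 2m$. In terms of the slowly varying components, the
second equation reads $i\dot{\Psi} = \boldsymbol{\sigma}\cdot\boldsymbol{\pi}\,\Phi + eA^0\Psi - 2m\Psi$, and neglecting $i\dot{\Psi}$ and
$eA^0\Psi$ against $2m\Psi$ we can solve it as:


$$\Psi \simeq \frac{\boldsymbol{\sigma}\cdot\boldsymbol{\pi}}{2m}\,\Phi \tag{4.27}$$


In this approximation, the Dirac equation can be expressed as a Schrödinger-like
equation, but with crucial spin-dependent corrections:


$$\begin{aligned}
i\frac{\partial \Phi}{\partial t} &= \left[ \frac{(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2}{2m} + eA^0 \right] \Phi \\
&= \left[ \frac{(\mathbf{p} - e\mathbf{A})^2}{2m} - \frac{e}{2m}\,\boldsymbol{\sigma}\cdot\mathbf{B} + eA^0 \right] \Phi
\end{aligned} \tag{4.28}$$


where we have used the fact that:

$$(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2 = \boldsymbol{\pi}^2 - e\,\boldsymbol{\sigma}\cdot\mathbf{B} \tag{4.29}$$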


The previous equation gives the first corrections to the Schrödinger equation in
the presence of a magnetic and electric field. Classically, we know that the energy 
of a magnetic dipole in a magnetic field is given by the dot product of the magnetic 
moment with the field: 


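$$E = -\boldsymbol{\mu}\cdot\mathbf{B} \tag{4.30}$$

Since $\mathbf{S} = \hbar\boldsymbol{\sigma}/2$, the magnetic moment of the electron is therefore:


$$\boldsymbol{\mu} = \frac{e\hbar}{2m}\,\boldsymbol{\sigma} = 2\left( \frac{e}{2m} \right)\mathbf{S} \tag{4.31}$$

Thus, the Dirac theory predicted that the electron should have a magnetic moment
twice what one might normally expect for its spin, that is, a gyromagnetic ratio
g = 2. This was
perhaps the first major success of the Dirac theory of the electron. Historically, it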
gave confidence to physicists that the Dirac theory was correct, even if it seemed 
to have problems with negative energy states. 
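
The spin term in the nonrelativistic limit traces back to the Pauli-matrix product rule $(\boldsymbol{\sigma}\cdot\mathbf{a})(\boldsymbol{\sigma}\cdot\mathbf{b}) = \mathbf{a}\cdot\mathbf{b} + i\boldsymbol{\sigma}\cdot(\mathbf{a}\times\mathbf{b})$. For ordinary vectors with $\mathbf{a} = \mathbf{b}$ the cross product vanishes, but for the operator $\boldsymbol{\pi} = \mathbf{p} - e\mathbf{A}$ the components do not commute, $\boldsymbol{\pi}\times\boldsymbol{\pi} = ie\mathbf{B}$, which yields the $-e\,\boldsymbol{\sigma}\cdot\mathbf{B}$ term. The sketch below only checks the c-number identity for two arbitrary sample vectors; the vectors and helper names are illustrative assumptions.

```python
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
I2 = [[1, 0], [0, 1]]

def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sigma_dot(v):
    """Return sigma . v as a 2x2 matrix."""
    return [[sum(v[n] * PAULI[n][i][j] for n in range(3)) for j in range(2)]
            for i in range(2)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

a = (1, -2, 3)   # arbitrary sample vectors with integer entries,
b = (2, 4, -1)   # so every comparison below is exact

lhs = matmul2(sigma_dot(a), sigma_dot(b))
s_cross = sigma_dot(cross(a, b))
rhs = [[dot(a, b) * I2[i][j] + 1j * s_cross[i][j] for j in range(2)]
       for i in range(2)]

square = matmul2(sigma_dot(a), sigma_dot(a))   # should equal |a|^2 times the identity
print(lhs == rhs, square)
```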
Yet another classic result that gave credibility to this relativistic formulation
was the splitting of the spectral lines of the hydrogen atom. To solve for corrections
to the Schrödinger atom, we set $\mathbf{B}$ to zero, and take $A^0$ to be the Coulomb potential.
The Dirac equation is now written as: 


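$$E\psi = \left( -i\boldsymbol{\alpha}\cdot\nabla + \beta m - \frac{Z\alpha}{r} \right)\psi = H\psi \tag{4.32}$$

where $\alpha = e^2/4\pi$ and:

$$H = \begin{pmatrix} m - \dfrac{Z\alpha}{r} & \boldsymbol{\sigma}\cdot\mathbf{p} \\ \boldsymbol{\sigma}\cdot\mathbf{p} & -m - \dfrac{Z\alpha}{r} \end{pmatrix} \tag{4.33}$$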


In hindsight, it turns out to be convenient to introduce a judicious ansatz for 
the solution of the Dirac equation. First, as in the Schrödinger case, we want to
separate variables, so that we take $\psi \sim f(r)\,Y(\theta, \phi)$, where the radial function is
explicitly separated off. Second, because $\nabla^2$ contains the Casimir operator $\mathbf{L}^2$ in
the usual Schrödinger formalism, we choose $Y(\theta, \phi)$ to be the standard spherical
harmonics, that is, eigenfunctions of the angular momentum operators. For the 
spinning case, however, what appears is: 


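$$\mathbf{J} = \mathbf{L} + \mathbf{S} = \mathbf{L} + \boldsymbol{\sigma}/2 \tag{4.34}$$


that is, we have a combination of orbital angular momentum $\mathbf{L}$ and intrinsic spin $\mathbf{S}$. Thus, our
eigenfunctions for the Dirac case must be labeled by eigenvalues j, l, m.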
Based on these arguments, we choose as our ansatz: 


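$$\psi^l_{jm} = \begin{pmatrix} i\,[G_{lj}(r)/r]\;\phi^l_{jm} \\ [F_{lj}(r)/r]\,(\boldsymbol{\sigma}\cdot\hat{\mathbf{r}})\;\phi^l_{jm} \end{pmatrix} \tag{4.35}$$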

Inserting this ansatz into Eq. (4.32) and factoring out the angular part, we find 
that the radial part of our eigenfunctions obey: 


dk 1\ F(r 
(C—O oe Galen 
digi iy Gy 
(z +m+ =*) BO = — ¢ # ;) —— (4.36) 


where we use the +(−) sign for $j = l + 1/2$ ($j = l - 1/2$). By power expanding this
equation in r, this series of equations can be solved in terms of hypergeometric 
functions. These equations, when power expanded, give us the energy eigenstates
of the hydrogen atom$^{1,2}$:


$$E_{nj} = m \left[ 1 + \left( \frac{Z\alpha}{n - (j + 1/2) + \sqrt{(j + 1/2)^2 - Z^2\alpha^2}} \right)^2 \right]^{-1/2} \tag{4.37}$$


Experimentally, this formula correctly gave spin-dependent corrections to the 
Bohr formula to lowest order. In contrast to the usual Bohr formula, the energy is 
now a function of both the principal quantum number n as well as the total
angular momentum
j. By power expanding, we can recover the usual nonrelativistic result to lowest 
order: 


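$$E_{nj} = m \left[ 1 - \frac{Z^2\alpha^2}{2n^2} - \frac{(Z\alpha)^4}{2n^4} \left( \frac{n}{j + 1/2} - \frac{3}{4} \right) + \cdots \right] \tag{4.38}$$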


So far, we have only considered the Dirac electron interacting with a classical 
Coulomb potential. Therefore, it is not surprising that this formula neglects smaller 
quantum corrections in the hydrogen energy levels. In particular, the $2S_{1/2}$ and
$2P_{1/2}$ levels have the same n and j values, so they should be degenerate according
to the Dirac formula. However, experimentally these two levels are found to be 
split by a small amount, called the Lamb shift. 
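
The Dirac energy formula and its expansion are easy to compare numerically. The sketch below evaluates both for the n = 2 levels of hydrogen in units of the electron mass; the numerical values of the fine-structure constant and the electron rest energy are approximate inputs supplied for this illustration, and the function names are invented here.

```python
import math

ALPHA = 1 / 137.035999          # fine-structure constant (approximate)
ELECTRON_MASS_EV = 510998.95    # electron rest energy in eV (approximate)

def dirac_energy(n, j, Z=1, m=1.0):
    """Exact Dirac energy of Eq. (4.37) for a point Coulomb source."""
    k = j + 0.5
    denom = n - k + math.sqrt(k * k - (Z * ALPHA) ** 2)
    return m / math.sqrt(1 + (Z * ALPHA / denom) ** 2)

def expanded_energy(n, j, Z=1, m=1.0):
    """Expansion of Eq. (4.38), kept through order (Z alpha)^4."""
    za2 = (Z * ALPHA) ** 2
    return m * (1 - za2 / (2 * n ** 2)
                - za2 ** 2 / (2 * n ** 4) * (n / (j + 0.5) - 0.75))

# The formula depends only on (n, j), so 2S_1/2 and 2P_1/2 share one entry each
# with identical arguments: the degeneracy is built in.
levels = {"2S1/2": (2, 0.5), "2P1/2": (2, 0.5), "2P3/2": (2, 1.5)}
exact = {name: dirac_energy(n, j) for name, (n, j) in levels.items()}
approx = {name: expanded_energy(n, j) for name, (n, j) in levels.items()}

# Fine-structure splitting between the j = 3/2 and j = 1/2 levels, in eV.
splitting_ev = (exact["2P3/2"] - exact["2P1/2"]) * ELECTRON_MASS_EV

for name in levels:
    print(name, exact[name], approx[name])
print("2P3/2 - 2P1/2 splitting (eV):", splitting_ev)
```

The exact and expanded values differ only at order $(Z\alpha)^6$, and the Lamb shift is absent by construction, since the energy function takes no orbital quantum number.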

It was not until 1949, with the correct formulation of QED, that one could 
successfully calculate these small quantum corrections to the Dirac energy lev- 
els. The calculation of these quantum corrections in QED was one of the finest 
achievements of quantum field theory. 


4.3 Quantizing the Maxwell Field 


Because of gauge invariance, there are also complications when we quantize the 
theory. A naive quantization of the Maxwell theory fails for a simple reason: The 
propagator does not exist. To see this, let us write down the action in the following 
form: 

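$$\mathscr{L} = \frac{1}{2} A^\mu P_{\mu\nu}\, \partial^2 A^\nu \tag{4.39}$$


where:


$$P_{\mu\nu} = g_{\mu\nu} - \partial_\mu \partial_\nu / (\partial)^2 \tag{4.40}$$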


The problem with this operator is that it is not invertible, and hence we cannot
construct a propagator for the theory. In fact, this is typical of any gauge theory,
not just Maxwell’s theory. This also occurs in general relativity and in superstring
theory. The operator is not invertible because $P_{\mu\nu}$ is a projection
operator, that is, its square is equal to itself:


$$P_{\mu\nu}P^{\nu\lambda} = P_\mu{}^\lambda \tag{4.41}$$

and it projects out longitudinal states:

$$\partial^\mu P_{\mu\nu} = 0 \tag{4.42}$$


The fact that $P_{\mu\nu}$ is a projection operator, of course, goes to the heart of
why Maxwell’s theory is a gauge theory. This projection operator projects out any
states with the form $\partial_\mu\Lambda$, which is just the statement of gauge invariance.
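
The projector property can be checked directly in momentum space, where $\partial_\mu\partial_\nu/\partial^2$ becomes $k_\mu k_\nu/k^2$. The sketch below uses exact rational arithmetic; the helper names and the sample momentum are illustrative assumptions, and the metric signature is taken to be $(+,-,-,-)$.

```python
from fractions import Fraction

METRIC = [1, -1, -1, -1]  # diagonal of g_{mu nu}, signature (+, -, -, -)

def projector(k_upper):
    """Return P_{mu nu} = g_{mu nu} - k_mu k_nu / k^2 with lowered indices."""
    k_lower = [METRIC[m] * k_upper[m] for m in range(4)]
    k2 = sum(k_upper[m] * k_lower[m] for m in range(4))
    return [[Fraction(METRIC[m] if m == n else 0)
             - Fraction(k_lower[m] * k_lower[n], k2)
             for n in range(4)] for m in range(4)]

def contract(a, b):
    """Return a_{mu nu} g^{nu rho} b_{rho lambda} for the diagonal metric."""
    return [[sum(a[m][n] * METRIC[n] * b[n][l] for n in range(4))
             for l in range(4)] for m in range(4)]

def longitudinal_part(k_upper, p):
    """Return k^mu P_{mu nu}; it vanishes for every nu if P is transverse."""
    return [sum(k_upper[m] * p[m][n] for m in range(4)) for n in range(4)]

if __name__ == "__main__":
    k = [3, 1, 2, 1]  # arbitrary integer off-shell momentum, k^2 = 3
    P = projector(k)
    print(contract(P, P) == P)      # projector: P squared equals P
    print(longitudinal_part(k, P))  # zero vector, so P has no inverse
```

Since $k^\mu P_{\mu\nu} = 0$ for a nonzero $k^\mu$, the operator has a zero eigenvalue, which is the momentum-space statement that no propagator exists before gauge fixing.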

The solution to this problem is that we must break this invariance by choosing
a gauge. Because we have the freedom to add $\partial_\mu\Lambda$ to $A_\mu$, we will
choose a specific value of $\Lambda$ which will break gauge invariance. We are
guaranteed this degree of freedom if the variation $\delta A_\mu = \partial_\mu\Lambda$
can be inverted. There are several ways in which we can fix the gauge and remove this
infinite redundancy. We can, for example, place constraints directly on the gauge
field $A_\mu$, or we may add the following term to the action:


$$-\frac{1}{2\alpha}\left(\partial_\mu A^\mu\right)^2 \tag{4.43}$$


where $\alpha$ is arbitrary. We list some common gauges:


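$$\begin{array}{ll}
\text{Coulomb gauge:} & \nabla_i A_i = 0 \\
\text{Axial gauge:} & A_3 = 0 \\
\text{Temporal gauge:} & A_0 = 0 \\
\text{Landau gauge:} & \partial_\mu A^\mu = 0 \\
\text{Landau gauge:} & \alpha = 0 \\
\text{Feynman gauge:} & \alpha = 1 \\
\text{Unitary gauge:} & \alpha = \infty
\end{array} \tag{4.44}$$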

(Notice that there are two equivalent ways to represent the Landau gauge.)
Each time we fix the gauge by placing a constraint on the gauge field $A_\mu$, we
must check that there exists a choice of $\Lambda$ such that this gauge condition is
possible. For example, if we set $A_3 = 0$, then we must show that:


$$A_3' = 0 = A_3 + \partial_3\Lambda \tag{4.45}$$

so that:


$$\Lambda = -\int dx_3\, A_3 \tag{4.46}$$


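In the Coulomb gauge, one extracts the longitudinal modes of the field
from the very start. To show that the gauge degree of freedom allows us to make
this choice, we write down:


$$\nabla\cdot\mathbf{A}' = \nabla\cdot(\mathbf{A} + \nabla\Lambda) = 0 \tag{4.47}$$

Solving for $\Lambda$, we find:

$$\Lambda = -\frac{1}{\nabla^2}\nabla\cdot\mathbf{A} = \frac{1}{4\pi}\int d^3x'\,\frac{1}{|\mathbf{x}-\mathbf{x}'|}\,\nabla'\cdot\mathbf{A}(x') \tag{4.48}$$


Likewise, the Landau gauge choice means that we can find a $\Lambda$ such that:


$$\Lambda = -\frac{1}{\partial^2}\,\partial_\mu A^\mu \tag{4.49}$$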


To begin the process of canonical quantization, we will take the Coulomb
gauge in which only the physical states are allowed to propagate. Let us first
calculate the canonical conjugate to the various fields. Since $\dot A_0$ does not
occur in the Lagrangian, $A_0$ has no canonical conjugate and does not propagate,
which is a sign that there are redundant modes in the action.

The other modes, however, have canonical conjugates:


$$\begin{aligned}
\pi^0 &= \frac{\delta\mathcal{L}}{\delta\dot A_0} = 0 \\
\pi^i &= \frac{\delta\mathcal{L}}{\delta\dot A_i} = -\dot A^i - \partial_i A_0 = E^i
\end{aligned} \tag{4.50}$$


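We write the Lagrangian as:


$$\mathcal{L} = \frac{1}{2}\left(\mathbf{E}^2 - \mathbf{B}^2\right) = \frac{1}{2}E_i^2 - \frac{1}{4}F_{ij}^2 \tag{4.51}$$


We now introduce the independent $E_i$ field via a trick. We rewrite the action as:


$$\mathcal{L} = -\frac{1}{2}E_i^2 - E_i F_{0i} - \frac{1}{4}F_{ij}^2 \tag{4.52}$$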
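By eliminating $E_i$ by its equation of motion, we find $E_i = -F_{0i}$. By inserting
this value back into the Lagrangian, we find the original Lagrangian back again.
Then we can write the Lagrangian for QED as:


$$\begin{aligned}
\mathcal{L} &= -\dot A_i E_i - A_0\nabla\cdot\mathbf{E} - \frac{1}{2}E_i^2 - \frac{1}{4}F_{ij}^2 + \bar\psi\left(i\gamma^\mu D_\mu - m\right)\psi \\
&= -\dot A_i E_i - \frac{1}{2}\left(\mathbf{E}^2 + \mathbf{B}^2\right) + \mathcal{L}(\psi, \mathbf{A}) - A_0\left(\nabla\cdot\mathbf{E} - e\bar\psi\gamma^0\psi\right)
\end{aligned} \tag{4.53}$$


$A_0$ is a Lagrange multiplier. If we solve for the equation of motion of this field,
we find that there is the additional constraint:


$$\text{Gauss's Law:}\qquad \nabla\cdot\mathbf{E} = \rho = e\bar\psi\gamma^0\psi \tag{4.54}$$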


Thus, Gauss’s Law emerges only after solving for the equation of motion of $A_0$.
If we now count the independent degrees of freedom, we find we have only two
degrees left, which correspond to the two independent transverse helicity states.
Of the original four components of $A_\mu$, we see that $A_0$ can be eliminated by its
equation of motion, and that we can gauge away the longitudinal mode, therefore
leaving us with $4 - 2 = 2$ degrees of freedom, which are precisely the two helicity
states predicted in Section 2.8 from group-theoretical arguments alone for massless
representations of the Poincaré group.

Intuitively, this means that a photon moving in the $z$ direction can vibrate
in the $x$ and $y$ directions, but not in the $z$ direction or the timelike direction.
This corresponds to the intuitive understanding of transverse photons. In the Coulomb
gauge, we can reduce all fields to their transverse components by eliminating their
longitudinal components. Let us separate out the transverse and longitudinal parts
as follows: $\mathbf{E} = \mathbf{E}_T + \mathbf{E}_L$, where $\nabla\cdot\mathbf{E}_T = 0$
and $\nabla\cdot\mathbf{E}_L = \rho$. Let us now solve for $\mathbf{E}_L$ in terms of
$\rho$. Then we have:


$$\mathbf{E}_L = \nabla\frac{1}{\nabla^2}\rho \tag{4.55}$$


If we insert this back into the Lagrangian, then we find that all longitudinal
contributions cancel, leaving only the transverse parts, except for the piece:


$$\frac{1}{2}\int d^3x\,\mathbf{E}_L^2 = -\frac{1}{2}\int d^3x\,\rho\,\frac{1}{\nabla^2}\,\rho = \frac{e^2}{8\pi}\int d^3x\,d^3y\,\frac{\psi^\dagger(\mathbf{x},t)\psi(\mathbf{x},t)\,\psi^\dagger(\mathbf{y},t)\psi(\mathbf{y},t)}{|\mathbf{x}-\mathbf{y}|} \tag{4.56}$$


This last term is called the “instantaneous four-fermion Coulomb term,” which
seems to violate special relativity since this interaction travels instantly across
space. However, we shall show at the end of this section that this term precisely
cancels against another term in the propagator, so Lorentz symmetry is restored.

If we impose canonical commutation relations, we find a further complication.
Naively, we might want to impose:


$$[A_i(\mathbf{x},t), \pi^j(\mathbf{y},t)] = -i\delta_{ij}\,\delta^3(\mathbf{x}-\mathbf{y}) \tag{4.57}$$


However, this cannot be correct because we can take the divergence of both sides
of the equation. The divergence of $A_i$ is zero, so the left-hand side is zero, but the
right-hand side is not. As a result, we must modify the canonical commutation
relations as follows:


$$[A_i(\mathbf{x},t), \pi^j(\mathbf{y},t)] = -i\tilde\delta_{ij}(\mathbf{x}-\mathbf{y}) \tag{4.58}$$


where the right-hand side must be transverse; that is:


$$\tilde\delta_{ij}(\mathbf{x}-\mathbf{y}) = \int\frac{d^3k}{(2\pi)^3}\,e^{i\mathbf{k}\cdot(\mathbf{x}-\mathbf{y})}\left(\delta_{ij} - \frac{k_i k_j}{\mathbf{k}^2}\right) \tag{4.59}$$

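As before, our next job is to decompose the Maxwell field in terms of its Fourier
modes, and then show that they satisfy the commutation relations. However, we
must be careful to maintain the transversality condition, which imposes a constraint
on the polarization vector. The decomposition is given by:


$$\mathbf{A}(x) = \int\frac{d^3k}{\sqrt{(2\pi)^3\,2\omega_k}}\sum_{\lambda=1}^{2}\boldsymbol{\epsilon}^\lambda(k)\left[a^\lambda(k)e^{-ik\cdot x} + a^{\lambda\dagger}(k)e^{ik\cdot x}\right] \tag{4.60}$$


In order to preserve the condition that $\mathbf{A}$ is transverse, we take the
divergence of this equation and set it to zero. This means that we must impose:

$$\boldsymbol{\epsilon}^\lambda\cdot\mathbf{k} = 0, \qquad \boldsymbol{\epsilon}^\lambda(k)\cdot\boldsymbol{\epsilon}^{\lambda'}(k) = \delta^{\lambda\lambda'} \tag{4.61}$$


(The simplest way of satisfying these transversality conditions is to take the
momentum along the $z$ direction, and keep the polarization vector totally in the
transverse directions, i.e., in the $x$ and $y$ directions. However, we will keep our
discussion as general as possible.) By inverting these relations, we can solve for
the Fourier moments in terms of the fields:


$$\begin{aligned}
a^\lambda(k) &= i\int\frac{d^3x}{\sqrt{(2\pi)^3\,2\omega_k}}\left[e^{ik\cdot x}\,\overleftrightarrow{\partial_0}\,\boldsymbol{\epsilon}^\lambda(k)\cdot\mathbf{A}(x)\right] \\
a^{\lambda\dagger}(k) &= -i\int\frac{d^3x}{\sqrt{(2\pi)^3\,2\omega_k}}\left[e^{-ik\cdot x}\,\overleftrightarrow{\partial_0}\,\boldsymbol{\epsilon}^\lambda(k)\cdot\mathbf{A}(x)\right]
\end{aligned} \tag{4.62}$$

In order to satisfy the canonical commutation relations among the fields, we must
impose the following commutation relations among the Fourier moments:


$$[a^\lambda(k), a^{\lambda'\dagger}(k')] = \delta^{\lambda\lambda'}\,\delta^3(\mathbf{k}-\mathbf{k}') \tag{4.63}$$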


(An essential point is that the sign of the commutation relations gives us positive
norm states. There are no negative norm states, or ghosts, in this construction
in the Coulomb gauge.) Let us now insert this Fourier decomposition into the
expression for the energy and momentum:


$$\begin{aligned}
H &= \frac{1}{2}\int d^3x\,\left(\mathbf{E}^2 + \mathbf{B}^2\right) = \frac{1}{2}\sum_{\lambda=1}^{2}\int d^3k\,\omega\left[a^\lambda(k)a^{\lambda\dagger}(k) + a^{\lambda\dagger}(k)a^\lambda(k)\right] \\
\mathbf{P} &= \int d^3x\,{:}\mathbf{E}\times\mathbf{B}{:} = \int d^3k\,\mathbf{k}\sum_{\lambda=1}^{2}a^{\lambda\dagger}(k)a^\lambda(k)
\end{aligned} \tag{4.64}$$


After normal ordering, once again the energy is positive definite. Finally, we wish
to calculate the propagator for the theory. Again, there is a complication because
the field is transverse. The simplest way to construct the propagator is to write
down the time-ordered vacuum expectation value of two fields. The calculation
is almost identical to the one for scalar and spinor fields, except we have the
polarization tensor to insert:


$$iD^{tr}_F(x-x')_{\mu\nu} = \langle 0|T A_\mu(x)A_\nu(x')|0\rangle = i\int\frac{d^4k}{(2\pi)^4}\,\frac{e^{-ik\cdot(x-x')}}{k^2 + i\epsilon}\sum_{\lambda=1}^{2}\epsilon^\lambda_\mu(k)\epsilon^\lambda_\nu(k) \tag{4.65}$$


The previous expression is not Lorentz invariant since we are dealing with
transverse states. This is a bit troubling, until we realize that Green’s functions are
off-shell objects and are not measurable. However, these Lorentz violating terms
should vanish in the full $S$ matrix. To see this explicitly, let us choose a new
basis of four-vectors, given by $\epsilon^1_\mu(k)$, $\epsilon^2_\mu(k)$, $k_\mu$, and a new
vector $n_\mu = (1,0,0,0)$. Any tensor can be expanded in terms of this new basis.
Therefore, the sum over polarization vectors appearing in the propagator can
always be expanded in terms of the tensors $g_{\mu\nu}$, $n_\mu n_\nu$, $k_\mu k_\nu$, and
$k_\mu n_\nu$. We can calculate the coefficients of this expansion by demanding that both
sides of the equation be transverse. Then it is easy to show:


$$\sum_{\lambda=1}^{2}\epsilon^\lambda_\mu(k)\epsilon^\lambda_\nu(k) = -g_{\mu\nu} - \frac{k_\mu k_\nu}{(k\cdot n)^2 - k^2} + \frac{(k\cdot n)(k_\mu n_\nu + k_\nu n_\mu)}{(k\cdot n)^2 - k^2} - \frac{k^2\,n_\mu n_\nu}{(k\cdot n)^2 - k^2} \tag{4.66}$$
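
The polarization sum above can be tested numerically for an arbitrary off-shell momentum. The sketch below builds two orthonormal transverse vectors by hand and compares both sides of the identity component by component; the function names and the way the transverse vectors are chosen are assumptions for illustration, with signature $(+,-,-,-)$ and $n_\mu = (1,0,0,0)$.

```python
import math

METRIC = [1.0, -1.0, -1.0, -1.0]  # diagonal of g_{mu nu}

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(v):
    norm = math.sqrt(sum(x * x for x in v))
    return tuple(x / norm for x in v)

def polarization_sum(k_upper):
    """Left-hand side: sum over the two transverse polarization vectors."""
    k_hat = unit(k_upper[1:])
    # Any axis not parallel to k_hat serves to seed the construction.
    helper = (1.0, 0.0, 0.0) if abs(k_hat[0]) < 0.9 else (0.0, 1.0, 0.0)
    e1 = unit(cross(k_hat, helper))
    e2 = cross(k_hat, e1)  # unit length because k_hat and e1 are orthonormal
    total = [[0.0] * 4 for _ in range(4)]
    for e in (e1, e2):
        eps_lower = [0.0] + [-x for x in e]  # epsilon_mu with lowered index
        for m in range(4):
            for n in range(4):
                total[m][n] += eps_lower[m] * eps_lower[n]
    return total

def tensor_expansion(k_upper):
    """Right-hand side, expanded in g, k k, k n + n k, and n n."""
    n_lower = [1.0, 0.0, 0.0, 0.0]
    k_lower = [METRIC[m] * k_upper[m] for m in range(4)]
    k2 = sum(k_upper[m] * k_lower[m] for m in range(4))
    kn = k_upper[0]  # k . n
    denom = kn * kn - k2
    return [[-(METRIC[m] if m == n else 0.0)
             - k_lower[m] * k_lower[n] / denom
             + kn * (k_lower[m] * n_lower[n] + k_lower[n] * n_lower[m]) / denom
             - k2 * n_lower[m] * n_lower[n] / denom
             for n in range(4)] for m in range(4)]

def max_difference(k_upper):
    lhs, rhs = polarization_sum(k_upper), tensor_expansion(k_upper)
    return max(abs(lhs[m][n] - rhs[m][n]) for m in range(4) for n in range(4))

if __name__ == "__main__":
    print(max_difference([0.7, 1.0, -2.0, 0.5]))  # tiny rounding error only
```

The result does not depend on which pair of transverse vectors is picked, since the sum over both is the projector onto the plane orthogonal to $\mathbf{k}$.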


Fortunately, the noninvariant terms involving $n$ do not survive in physical
amplitudes. The terms proportional to $k_\mu$ vanish when inserted into a scattering
amplitude. This is because the propagator couples to two currents, which in turn are
conserved by gauge invariance. (To see this, notice that the theory is invariant under
$\delta A_\mu = \partial_\mu\Lambda$. In a scattering amplitude, this means that adding
$k_\mu$ to the polarization vector $\epsilon_\mu$ cannot change the amplitude. Thus,
$k_\mu$ terms in the propagator do not couple to the rest of the diagram. This will be
discussed in more detail when we study the Ward identities.)
If we drop terms proportional to $k_\mu$ in the propagator, we are left with:


$$D^{tr}_F(x-x')_{\mu\nu} = -g_{\mu\nu}\,\Delta_F(x-x'; m=0) - n_\mu n_\nu\,\frac{\delta(t-t')}{4\pi|\mathbf{x}-\mathbf{x}'|} \tag{4.67}$$


The first term is what we want since it is covariant. The second term, proportional
to $n_\mu n_\nu$, is called the instantaneous Coulomb term. In any Feynman diagram,
it occurs between two currents, creating $(\psi^\dagger\psi)\nabla^{-2}(\psi^\dagger\psi)$.
This term precisely cancels the Coulomb term found in the Hamiltonian when we solved
for $\mathbf{E}_L$ in Eq. (4.56).

As expected, we therefore find that although the Green’s function is gauge
dependent (and possesses terms that travel instantly across space), the $S$ matrix is
Lorentz invariant and causal.


4.4 Gupta-Bleuler Quantization 


The advantage of the canonical quantization method in the Coulomb gauge is that
we always work with transverse states. Thus, all states have positive norm:


$$\langle 0|a^\lambda(k)\,a^{\lambda'\dagger}(k')|0\rangle = +\delta^{\lambda\lambda'}\,\delta^3(\mathbf{k}-\mathbf{k}') \tag{4.68}$$


However, the canonical quantization method, although it is guaranteed to yield
a unitary theory, is cumbersome because Lorentz invariance is explicitly broken.
For higher spin theories, the loss of Lorentz invariance makes any calculation
several times more difficult.

There is another method of quantization, called the Gupta-Bleuler$^{3,4}$
quantization method or covariant method, which keeps manifest Lorentz invariance and
simplifies any calculation. There is, however, a price that must be paid: the theory
allows negative norm states, or ghosts, to propagate. The resulting theory is
manifestly Lorentz invariant with the presence of these ghosts, but the theory is
still self-consistent because we remove these ghost states by hand from the physical
states of the theory. We begin by explicitly breaking gauge invariance by adding a
noninvariant term into the action:


$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 - \frac{1}{2\alpha}\left(\partial_\mu A^\mu\right)^2 \tag{4.69}$$


for arbitrary $\alpha$.
Then the action now reads:


$$\mathcal{L} = \frac{1}{2}A^\mu P_{\mu\nu}\,\partial^2 A^\nu \tag{4.70}$$


where:


$$P_{\mu\nu} = g_{\mu\nu} - \left(1 - \alpha^{-1}\right)\partial_\mu\partial_\nu/\partial^2 \tag{4.71}$$


Now that we have explicitly broken the gauge invariance, this operator is no longer
a projection operator and hence can be inverted to find the propagator:


$$D_{\mu\nu} = -\left(P^{-1}\right)_{\mu\nu}/\partial^2 = -\left[g_{\mu\nu} - (1-\alpha)\,\partial_\mu\partial_\nu/\partial^2\right]/\partial^2 \tag{4.72}$$
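
The inversion can be checked in momentum space, where $\partial_\mu\partial_\nu/\partial^2$ becomes $k_\mu k_\nu/k^2$. The sketch below multiplies the gauge-fixed operator by the bracket appearing in the propagator and confirms that the product is the metric tensor. The function names, the sample momentum, and the sample value of $\alpha$ are illustrative assumptions.

```python
from fractions import Fraction

METRIC = [1, -1, -1, -1]  # diagonal of g_{mu nu}, signature (+, -, -, -)

def operator(k_upper, coeff):
    """Return g_{mu nu} - coeff * k_mu k_nu / k^2 with lowered indices."""
    k_lower = [METRIC[m] * k_upper[m] for m in range(4)]
    k2 = sum(k_upper[m] * k_lower[m] for m in range(4))
    return [[Fraction(METRIC[m] if m == n else 0)
             - coeff * Fraction(k_lower[m] * k_lower[n], k2)
             for n in range(4)] for m in range(4)]

def contract(a, b):
    """Return a_{mu nu} g^{nu rho} b_{rho lambda} for the diagonal metric."""
    return [[sum(a[m][n] * METRIC[n] * b[n][l] for n in range(4))
             for l in range(4)] for m in range(4)]

def is_metric(t):
    return all(t[m][n] == (METRIC[m] if m == n else 0)
               for m in range(4) for n in range(4))

def gauge_fixed_pair(k_upper, alpha):
    """Return (P, P^{-1}) as read off from the operator and the propagator."""
    alpha = Fraction(alpha)
    return operator(k_upper, 1 - 1 / alpha), operator(k_upper, 1 - alpha)

if __name__ == "__main__":
    P, P_inv = gauge_fixed_pair([3, 1, 2, 1], 3)
    print(is_metric(contract(P, P_inv)))  # True for any nonzero alpha
```

For $\alpha = 1$ both operators reduce to $g_{\mu\nu}$, which is why that gauge choice gives the simplest propagator.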


This propagator explicitly propagates ghost states that violate unitarity. The $D_{00}$
component of the propagator occurs with the wrong sign, and hence represents a
ghost state. For our purposes, we will take the gauge $\alpha = 1$, so that the equation
of motion now reads:


$$\partial^2 A_\mu = 0 \tag{4.73}$$


In this gauge, we find that $A_0$ is no longer a redundant Lagrange multiplier, but a
dynamical field and hence has a canonical conjugate. The conjugate field of
$A_\mu$ is now a four-vector:


$$\pi^\mu = \frac{\delta\mathcal{L}}{\delta\dot A_\mu} \tag{4.74}$$


Then the covariant canonical commutation relations read:


$$[A_\mu(\mathbf{x},t), \pi^\nu(\mathbf{x}',t)] = i\delta_\mu^\nu\,\delta^3(\mathbf{x}-\mathbf{x}') \tag{4.75}$$

As usual, we can decompose the field in terms of the Fourier moments:

$$A_\mu(x) = \int\frac{d^3k}{\sqrt{(2\pi)^3\,2\omega_k}}\sum_{\lambda=0}^{3}\epsilon^\lambda_\mu(k)\left[a^\lambda(k)e^{-ik\cdot x} + a^{\lambda\dagger}(k)e^{ik\cdot x}\right] \tag{4.76}$$


The difference now is that the $\epsilon_\mu$ vector is a genuine four-vector. In order for
the canonical commutation relations to be satisfied, we necessarily choose the
following commutation relations among the oscillators:


$$[a^\lambda(k), a^{\lambda'\dagger}(k')] = -g^{\lambda\lambda'}\,\delta^3(\mathbf{k}-\mathbf{k}') \tag{4.77}$$


The presence of the metric tensor in the commutation relation signals that the
norm of the states may be negative; that is, a nonphysical, negative norm ghost is
present in the theory. The norm of the state $a^{\lambda\dagger}(k)|0\rangle$ can now be
negative. This is the price we pay for having a Lorentz covariant quantization scheme.
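
The sign can be made explicit with a short computation, sketched here under the usual assumptions that the vacuum is normalized, $\langle 0|0\rangle = 1$, and is annihilated by every $a^\lambda(k)$:

```latex
\langle 0|\,a^{\lambda}(k)\,a^{\lambda'\dagger}(k')\,|0\rangle
  = \langle 0|\,[a^{\lambda}(k),\,a^{\lambda'\dagger}(k')]\,|0\rangle
  = -g^{\lambda\lambda'}\,\delta^{3}(\mathbf{k}-\mathbf{k}')
```

For $\lambda = \lambda' = 0$ the right-hand side is $-\delta^3(\mathbf{k}-\mathbf{k}')$, so the timelike mode has negative norm, while for $\lambda = \lambda' = 1, 2, 3$ it is $+\delta^3(\mathbf{k}-\mathbf{k}')$.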

It is straightforward to prove that the propagator is now given by:


$$\langle 0|T A_\mu(x)A_\nu(y)|0\rangle = -ig_{\mu\nu}\,\Delta_F(x-y) \tag{4.78}$$


This can be proved by explicitly inserting the operator expression for $A_\mu(x)$.
The important aspect of this propagator is that it contains the metric $g_{\mu\nu}$, which
alternates in sign. Hence, it propagates ghost states. Since ghosts now propagate
in the theory, we must be careful how we remove them. If we take the condition
$\partial_\mu A^\mu|\Psi\rangle = 0$, we find that this condition is too stringent; it has no
solutions at all. The Gupta-Bleuler formalism is based on the observation that a weaker
condition is required:


$$\left(\partial_\mu A^\mu\right)^{(+)}|\Psi\rangle = 0 \tag{4.79}$$


where we only allow the destruction part of the constraint to act on physical states.
In momentum space, this is equivalent to the condition that $k^\mu a_\mu(k)|\Psi\rangle = 0$.
This guarantees that, although ghosts are allowed to circulate in the system, they are
explicitly removed from all physical states of the theory.

(We can also quantize the massive vector field in much the same way. The
quantization is almost identical to the one presented before, but now the counting
of physical states is different. We recall from our discussion of the Poincaré group
that a massless field only has two helicity components. However, the massive
vector field has three components.)


4.5 C, P, and T Invariance


Although we have analyzed the behavior of quantum fields under continuous
isospin and Lorentz symmetry, we must also investigate their behavior under
discrete symmetries, such as that generated by parity, charge conjugation, and
time-reversal symmetry:


$$\begin{array}{ll}
\mathcal{P}: & \mathbf{x} \to -\mathbf{x} \\
\mathcal{C}: & e \to -e \\
\mathcal{T}: & t \to -t
\end{array} \tag{4.80}$$


Classically, we know that the laws of physics are invariant under these transfor- 
mations. For example, we find that both Newton’s and Maxwell’s equations are 
invariant under these transformations. 

The easiest way in which to calculate how the $A_\mu$ field transforms is to examine
Maxwell’s equation, especially the source term $j^\mu$. Under a parity transformation,
the electric charge distribution $\rho$ does not change, but $\mathbf{j} \to -\mathbf{j}$
because we are reversing the direction of the electric current. Thus:


$$\mathcal{P}\,j^\mu(\mathbf{x},t)\,\mathcal{P}^{-1} = j_\mu(-\mathbf{x},t) \tag{4.81}$$


Under a charge conjugation, a positive electric current turns into a negative one,
so that:


$$\mathcal{C}\,j^\mu\,\mathcal{C}^{-1} = -j^\mu \tag{4.82}$$


Then under a time reversal, once again $\rho$ remains the same, but $\mathbf{j}$ reverses
sign (because the current reverses direction), so:


$$\mathcal{T}\,j^\mu(\mathbf{x},t)\,\mathcal{T}^{-1} = j_\mu(\mathbf{x},-t) \tag{4.83}$$


Then the transformation of $A_\mu$ can be found immediately. One way is to observe
that $j^\mu A_\mu$ appears in the action and is an invariant. Then we can read
off the transformation properties of $A_\mu$. Another way in which to derive the
transformation properties of $A_\mu$ is to use Maxwell’s equations,
$\partial_\mu F^{\mu\nu}(A) = j^\nu$. Knowing the transformation properties of
$\partial_\mu$ and $j^\nu$, it is then easy to solve for the transformation properties
of $A_\mu$. We summarize our results as follows:


(4.84)


Similarly, we can read off the transformation properties of $\psi$, using the Dirac
equation and also the fact that $j^\mu = e\bar\psi\gamma^\mu\psi$, although the calculation
is considerably longer. With a bit of work, we can summarize how Dirac bilinear
scalars, vectors, axial vectors, etc. transform under these discrete transformations,
including the combined operation CPT:


(4.85)


To prove this, let us examine each transformation separately. 


4.5.1 Parity 


When the Dirac field couples to the Maxwell field, we want the combination
$j^\mu A_\mu$ appearing in the action to be parity conserving.

To find an explicit form for the operator $\mathcal{P}$ for the Dirac field, we will
find it convenient to recall that any element of $O(3,1)$ (which includes parity
operations with $\det O = -1$) has the following effect on the Dirac matrix:


$$S(\Lambda)^{-1}\gamma^\mu S(\Lambda) = \Lambda^\mu{}_\nu\,\gamma^\nu \tag{4.86}$$

For our case, we can set the matrix $\Lambda$ to be our parity operator if we specify
that it reverses the sign of $\mathbf{x}$:


$$\Lambda^\mu{}_\nu = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \tag{4.87}$$


which is equal to the metric tensor $g_{\mu\nu}$. Thus, we want a solution of the
following equation:


$$P^{-1}\gamma^\mu P = \Lambda^\mu{}_\nu\,\gamma^\nu = -(-1)^{\delta_{\mu 0}}\,\gamma^\mu \tag{4.88}$$

An explicit solution of this equation is simply given by:

$$P = e^{i\phi}\gamma^0 \tag{4.89}$$


where $e^{i\phi}$ is an irrelevant phase factor.
Thus, the action of the parity transformation on a spinor field is given by:


$$\text{Parity:}\qquad \psi'(-\mathbf{x},t) = S(\Lambda)\psi = e^{i\phi}\gamma^0\psi(\mathbf{x},t) \tag{4.90}$$

This also means that the transformation of the $\bar\psi$ field is given by:

$$\bar\psi'(-\mathbf{x},t) = \bar\psi(\mathbf{x},t)\,\gamma^0 e^{-i\phi} \tag{4.91}$$

From this, we can calculate how the various bilinear combinations transform under
the parity operation in Eq. (4.85).
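
As a concrete check that $\gamma^0$ solves the parity condition, the sketch below builds the gamma matrices in the Dirac representation and conjugates each one by $\gamma^0$. The helper names are illustrative assumptions, and the phase $e^{i\phi}$ is set to 1.

```python
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ZERO2, ONE2 = [[0, 0], [0, 0]], [[1, 0], [0, 1]]
METRIC = [1, -1, -1, -1]  # diagonal of g^{mu nu}

def scale(s, m):
    return [[s * x for x in row] for row in m]

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[r] + b[r] for r in range(2)] + [c[r] + d[r] for r in range(2)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

# Dirac representation: gamma^0 = diag(1, -1), gamma^i = [[0, s_i], [-s_i, 0]]
GAMMA = [block(ONE2, ZERO2, ZERO2, scale(-1, ONE2))] + \
        [block(ZERO2, s, scale(-1, s), ZERO2) for s in SIGMA]
IDENTITY = block(ONE2, ZERO2, ZERO2, ONE2)

def parity_conjugate(mu):
    """Return P^{-1} gamma^mu P for P = gamma^0, which is its own inverse."""
    return matmul(matmul(GAMMA[0], GAMMA[mu]), GAMMA[0])

if __name__ == "__main__":
    for mu in range(4):
        # gamma^0 is unchanged, the spatial gamma^i flip sign
        print(mu, parity_conjugate(mu) == scale(METRIC[mu], GAMMA[mu]))
```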


4.5.2 Charge Conjugation 


Charge conjugation is easily studied by taking the Dirac equation and then
reversing the sign of the electric charge. If we let $\psi_c$ represent the Dirac field
that has the opposite charge to $\psi$, then we have:


$$\begin{aligned}
\left(i\gamma^\mu\partial_\mu - e\gamma^\mu A_\mu - m\right)\psi &= 0 \\
\left(i\gamma^\mu\partial_\mu + e\gamma^\mu A_\mu - m\right)\psi_c &= 0
\end{aligned} \tag{4.92}$$


In order to find the relationship between $\psi$ with charge $e$ and $\psi_c$ with charge
$-e$, let us take the complex conjugate and then the transpose of the first equation.
Then we find:


$$\left[\gamma^{\mu T}\left(-i\partial_\mu - eA_\mu\right) - m\right]\left(\gamma^{0T}\psi^*\right) = 0 \tag{4.93}$$

It can be shown that for any representation of the Dirac algebra, there exists a
matrix $C$ that satisfies:


$$C\gamma_\mu^T C^{-1} = -\gamma_\mu \tag{4.94}$$


Now let us compare the previous equation with the equation for the $\psi_c$ field. We
have an exact correspondence (up to a phase) if we set:


$$\psi_c = e^{i\phi}C\gamma^{0T}\psi^* = e^{i\phi}C\bar\psi^T \tag{4.95}$$


So far, we have not specified the representation of the Dirac matrices. There
is more than one solution of the equation for $C$. In the Dirac representation, we
find the following solution for the $C$ matrix:


$$C = i\gamma^2\gamma^0 = \begin{pmatrix} 0 & -i\sigma^2 \\ -i\sigma^2 & 0 \end{pmatrix} \tag{4.96}$$


which satisfies the following additional constraints:

$$-C = C^{-1} = C^T = C^\dagger \tag{4.97}$$


For us, however, the important feature of the C matrix is that it allows us to 
identify the particle—antiparticle structure of the Dirac field. 
Applying the $C$ matrix to the particle field, we obtain the antiparticle field:


$$\psi = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} \;\to\; \psi_c = e^{i\phi}\begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} \tag{4.98}$$


which justifies our earlier statement that the Dirac equation contains both the
particle and antiparticle fields, with the charges as well as spins reversed.
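
The properties of the $C$ matrix quoted above can be verified by direct matrix multiplication in the Dirac representation. The following sketch is illustrative only: the helper names are assumptions, and the phase $e^{i\phi}$ is set to 1.

```python
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ZERO2, ONE2 = [[0, 0], [0, 0]], [[1, 0], [0, 1]]

def scale(s, m):
    return [[s * x for x in row] for row in m]

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[r] + b[r] for r in range(2)] + [c[r] + d[r] for r in range(2)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(m):
    return [[m[j][i] for j in range(4)] for i in range(4)]

def dagger(m):
    return [[complex(m[j][i]).conjugate() for j in range(4)] for i in range(4)]

GAMMA = [block(ONE2, ZERO2, ZERO2, scale(-1, ONE2))] + \
        [block(ZERO2, s, scale(-1, s), ZERO2) for s in SIGMA]
IDENTITY = block(ONE2, ZERO2, ZERO2, ONE2)

C = matmul(scale(1j, GAMMA[2]), GAMMA[0])  # C = i gamma^2 gamma^0
C_INV = scale(-1, C)                       # claimed inverse, checked below

def conjugated_gamma(mu):
    """Return C gamma^{mu T} C^{-1}, expected to equal -gamma^mu."""
    return matmul(matmul(C, transpose(GAMMA[mu])), C_INV)

def charge_conjugate(psi):
    """Return psi_c = C gamma^{0T} psi^* for a 4-component spinor."""
    g0t = transpose(GAMMA[0])
    v = [sum(g0t[i][j] * complex(psi[j]).conjugate() for j in range(4))
         for i in range(4)]
    return [sum(C[i][j] * v[j] for j in range(4)) for i in range(4)]

if __name__ == "__main__":
    print(matmul(C, C_INV) == IDENTITY, transpose(C) == C_INV)
    print(charge_conjugate([1, 0, 0, 0]))  # only the fourth component is nonzero
```

The last line shows the spin-up positive-energy component being mapped into the fourth component, in line with the particle-antiparticle interpretation.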

Now let us compute how the current $j^\mu$ transforms under the charge conjugation
operation:


$$j^\mu \to \bar\psi_c\gamma^\mu\psi_c = -\psi^T C^{-1}\gamma^\mu C\bar\psi^T = \psi^T\gamma^{\mu T}\bar\psi^T = -\bar\psi\gamma^\mu\psi \tag{4.99}$$


The last minus sign is important: Because $\psi$ is an anticommuting field, we pick
up an extra minus sign when we move one spinor past another.

Now let us try to determine how the combined Dirac and Maxwell system
transforms under charge conjugation. The Dirac equation is left invariant if we
make the simultaneous change:


$$A_\mu \to -A_\mu \tag{4.100}$$


(We have changed the sign of the Maxwell field, which simulates the change of
the charge $e$ that we made earlier.)
Thus, the combination $j_\mu A^\mu$ is invariant under charge conjugation, as desired.


4.5.3 Time Reversal 


Finally, we analyze the effect of making the transformation $t \to -t$. We wish to
find an explicit representation of the operator $\mathcal{T}$ in terms of harmonic
oscillators. This can be done in several ways. We can represent the time-reversal
operator as $S(\Lambda)$ acting on a spinor where $\Lambda_{\mu\nu} = -g_{\mu\nu}$. Or, we
can write down the Dirac equation with a time reversal and try to retransform the
equation back into the usual Dirac form.

Either way, we find the same result:


$$\mathcal{T}\psi(\mathbf{x},t)\mathcal{T}^{-1} = e^{i\phi}\,T\psi(\mathbf{x},-t) \tag{4.101}$$

where:

$$T\gamma_\mu T^{-1} = \gamma_\mu^T = \gamma^{\mu *} \tag{4.102}$$

An explicit representation of the $T$ matrix is given by:

$$T = i\gamma^1\gamma^3 \tag{4.103}$$

where:

$$T = T^\dagger = T^{-1} = -T^* \tag{4.104}$$
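
The stated properties of the $T$ matrix can be confirmed numerically in the Dirac representation. In the sketch below the helper names are illustrative assumptions; the check uses $T = i\gamma^1\gamma^3$ and tests both its conjugation action on the gamma matrices and its Hermiticity properties.

```python
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ZERO2, ONE2 = [[0, 0], [0, 0]], [[1, 0], [0, 1]]

def scale(s, m):
    return [[s * x for x in row] for row in m]

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[r] + b[r] for r in range(2)] + [c[r] + d[r] for r in range(2)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(m):
    return [[m[j][i] for j in range(4)] for i in range(4)]

def conjugate(m):
    return [[complex(x).conjugate() for x in row] for row in m]

GAMMA = [block(ONE2, ZERO2, ZERO2, scale(-1, ONE2))] + \
        [block(ZERO2, s, scale(-1, s), ZERO2) for s in SIGMA]
IDENTITY = block(ONE2, ZERO2, ZERO2, ONE2)

T = matmul(scale(1j, GAMMA[1]), GAMMA[3])  # T = i gamma^1 gamma^3

def time_reversed_gamma(mu):
    """Return T gamma^mu T^{-1}, using T^{-1} = T (checked separately)."""
    return matmul(matmul(T, GAMMA[mu]), T)

if __name__ == "__main__":
    print(matmul(T, T) == IDENTITY)
    for mu in range(4):
        print(mu, time_reversed_gamma(mu) == transpose(GAMMA[mu]))
```

Replacing $\gamma^3$ by $\gamma^2$ in the definition of `T` makes the conjugation test fail for $\mu = 2$ and $\mu = 3$, which is a quick way to see that the index matters.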


We should also mention that the $\mathcal{T}$ operator is unusual because it is
antiunitary. For example, consider the time evolution equation:


$$[H, \phi(x)] = -i\frac{\partial\phi}{\partial t} \tag{4.105}$$

Let us make a time-reversal transformation on this equation. If we reverse the
sign of $t$ in this equation, we find:


$$[H, \phi(x')] = +i\frac{\partial\phi(x')}{\partial t} \tag{4.106}$$


However, this has the net effect of transforming $H \to -H$, which is illegal
(because then we would have negative energy states). This difficulty did not
arise for parity or charge conjugation because $\partial/\partial t$ did not change for
those symmetries.
Correspondingly, we want $\mathcal{T}$ to reverse the exponent appearing in the
time evolution operator:


$$\mathcal{T}e^{-iH(t_1-t_2)}\mathcal{T}^{-1} = e^{+iH(t_1-t_2)} \tag{4.107}$$


However, this is impossible if $\mathcal{T}$ is an ordinary linear operator that
commutes with the Hamiltonian.
This means that the operator $\mathcal{T}$ must be antiunitary:


$$\langle\mathcal{T}\phi|\mathcal{T}\psi\rangle = \langle\psi|\phi\rangle \tag{4.108}$$


Equivalently, this operator contains yet another operator that takes the complex
conjugate of any c number. If we postulate an operator that can reverse $i \to -i$,
then $\mathcal{T}$ can commute with the Hamiltonian yet still reverse the sign of $t$.


4.6 CPT Theorem 


In nature, these discrete symmetries are violated. Parity is maximally violated by
the weak interactions, and the combination CP is violated in K meson decays.
However, there is a remarkable theorem that states that any quantum field theory is
invariant under the combined operation of CPT$^{5-7}$ under very general conditions.
The theorem states that the Hamiltonian $\mathcal{H}$ is invariant under CPT:


$$(\mathcal{CPT})\,\mathcal{H}(x)\,(\mathcal{CPT})^{-1} = \mathcal{H}(-x) \tag{4.109}$$

if the following two conditions are met:


1. The theory must be local, possess a Hermitian Lagrangian, and be invariant
under proper Lorentz transformations.

2. The theory must be quantized with commutators for integral spin fields and
quantized with anticommutators for half-integral spin fields (i.e., the usual
spin-statistics connection).


Thus, although quantum field theories are easily written down that violate these
discrete symmetries separately, any quantum theory obeying these very general
features must be invariant under CPT.

Various proofs of this powerful theorem have been proposed over the years.
We will not review the results from axiomatic field theory, which are the most
rigorous. Rather than give a detailed proof, we will show that the CPT theorem
is satisfied for the spin 0, 1/2, and 1 fields that we have so far investigated.

We first note that it is easy to show that the CPT operation changes any
quantum field theory obeying these two assumptions in the following way:


1. The coordinates change as follows:


$$x^\mu \to -x^\mu \tag{4.110}$$


2. The Maxwell field transforms as follows:


$$\begin{aligned}
\mathcal{P}A_\mu(x)\mathcal{P}^{-1} &= A^\mu(x') \\
\mathcal{C}A_\mu(x)\mathcal{C}^{-1} &= -A_\mu(x) \\
\mathcal{T}A_\mu(x)\mathcal{T}^{-1} &= A^\mu(x')
\end{aligned} \tag{4.111}$$


where $x'$ refers to the $\mathcal{T}$ or the $\mathcal{P}$ transformed variable. Thus,
under the combined CPT, we have:


$$A_\mu(x) \to -A_\mu(-x) \tag{4.112}$$

3. A Dirac spinor transforms under CPT as:

$$\psi_\alpha(x) \to -i\,\bar\psi_\beta(-x)\left(\gamma_5\gamma_0\right)_{\beta\alpha} \tag{4.113}$$

4. A Dirac bilinear changes as follows:

$$\bar\psi(x)\,O\,\psi(x) \to (-1)^k\,\bar\psi(x')\,O\,\psi(x') \tag{4.114}$$


where $k$ denotes the number of Lorentz indices appearing in the matrix $O$.
(This assumed the spin-statistics connection, since we had to push one spinor
past another.)

5. Any even-rank tensor transforms into its Hermitian conjugate, and all odd-rank
tensors transform into the negative of their Hermitian conjugates.


6. All c numbers appearing in the theory are complex conjugated. 


The sketch of the proof now proceeds as follows. First, we observe that the
Lagrangian is invariant under CPT:


$$(\mathcal{CPT})\,\mathcal{L}(x)\,(\mathcal{CPT})^{-1} = \mathcal{L}(-x) \tag{4.115}$$


This is because the Lagrangian is a contraction of many tensors of different rank,
but the sum of all the ranks must be even since $\mathcal{L}$ is a Lorentz scalar. By (5),
this means that the Lagrangian is transformed into its Hermitian conjugate. But
since the Lagrangian is Hermitian by the first assumption, we find that $\mathcal{L}$
must be CPT invariant.

One possible loophole in this construction is if the Lagrangian contains an
infinite number of derivatives. Then condition (1) would be difficult to obey.
This loophole is closed by the assumption that $\mathcal{L}$ is local. [Nonlocal theories
containing terms like $\phi(x)\phi(y)$ can be Taylor expanded as follows:


$$\phi(x)\phi(y) = \phi(x)\,e^{(y-x)^\mu\partial_\mu}\,\phi(x) \tag{4.116}$$


Nonlocal theories therefore contain an infinite number of derivatives, which are
excluded from our discussion by the first assumption.]

Now that the Lagrangian is invariant, let us analyze the transformation property
of the Hamiltonian, which is defined as:


$$\mathcal{H} = -\mathcal{L} + \sum_r {:}\pi_r(x)\dot\phi_r(x){:} \tag{4.117}$$


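where the sum over $r$ runs over both bosons and fermions.
To calculate the transformation of the Hamiltonian under CPT, we recall the
transformation properties of the boson and fermion fields under CPT:


$$\begin{aligned}
(\mathcal{CPT})\,\phi_r(x)\,(\mathcal{CPT})^{-1} &= e^{i\alpha}\phi_r(-x) \\
(\mathcal{CPT})\,\dot\phi_r(x)\,(\mathcal{CPT})^{-1} &= -e^{i\alpha}\dot\phi_r(-x) \\
(\mathcal{CPT})\,\pi_r(x)\,(\mathcal{CPT})^{-1} &= -e^{-i\alpha}\pi_r(-x)
\end{aligned} \tag{4.118}$$


To show that the Hamiltonian is invariant, we observe that the equal-time
commutation relations for the bosons and fermions are preserved if the
spin-statistics relation holds:


$$\begin{aligned}
\text{Boson:}\quad & [\pi_r(\mathbf{x},t), \phi_s(\mathbf{x}',t)] = -i\delta^3(\mathbf{x}-\mathbf{x}')\,\delta_{rs} \\
\text{Fermion:}\quad & \{\pi_r(\mathbf{x},t), \phi_s(\mathbf{x}',t)\} = i\delta^3(\mathbf{x}-\mathbf{x}')\,\delta_{rs}
\end{aligned} \tag{4.119}$$

Then we have all the identities necessary to show that:

$$(\mathcal{CPT})\,\mathcal{H}(x)\,(\mathcal{CPT})^{-1} = \mathcal{H}(-x) \tag{4.120}$$


which completes the proof.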

In summary, we have seen how to quantize the Maxwell spin-one field via the 
canonical method. The formulation is a bit clumsy because we must quantize the 
theory in the Coulomb gauge to eliminate the ghost states with negative norm. 
The Gupta-Bleuler formulation is more convenient because it is Lorentz covariant,
but we must apply the ghost-killing constraint on the states. The reason for this 
complication is that the Maxwell theory is a gauge theory with the group U(1). 

In Chapter 5, we will analyze how to calculate scattering cross sections for 
QED with quantum field theory. 


4.7 Exercises 


1. Prove that the radial functions $G_{lj}$ and $F_{lj}$ obey Eq. (4.36) starting with the
Dirac equation for an electron in a Coulomb field.


2. There are no finite-dimensional unitary representations of the Lorentz group,
but what about a vector field $A_\mu$, which forms a finite-dimensional
representation space for the Lorentz group? Does this violate the no-go theorem? Why
not?


3. There are very few exactly solvable problems involving the classical Dirac
equation. One of them is the electron in a Coulomb potential. Another is the
electron in a constant magnetic field. Solve the Dirac equation in this case,
and show that:

$$E = \sqrt{m^2 + p_z^2 + 2n|eB|} \tag{4.121}$$


where $n = 0, 1, 2, \ldots$.


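4. Prove that:

$$\mathcal{P}\,b(\mathbf{p},s)\,\mathcal{P}^{-1} = b(-\mathbf{p},s); \qquad \mathcal{P}\,d^\dagger(\mathbf{p},s)\,\mathcal{P}^{-1} = -d^\dagger(-\mathbf{p},s) \tag{4.122}$$


Prove that an explicit operator representation for $\mathcal{P}$ is given by:


$$\mathcal{P} = \exp\left(-\frac{i\pi}{2}\sum_s\int d^3p\left[b^\dagger(\mathbf{p},s)b(\mathbf{p},s) - b^\dagger(\mathbf{p},s)b(-\mathbf{p},s) + d^\dagger(\mathbf{p},s)d(\mathbf{p},s) + d^\dagger(\mathbf{p},s)d(-\mathbf{p},s)\right]\right) \tag{4.123}$$

5. Prove that:


$$\begin{aligned}
\mathcal{C}\,b(\mathbf{p},s)\,\mathcal{C}^{-1} &= d(\mathbf{p},s)\,e^{i\phi} \\
\mathcal{C}\,d^\dagger(\mathbf{p},s)\,\mathcal{C}^{-1} &= b^\dagger(\mathbf{p},s)\,e^{i\phi}
\end{aligned} \tag{4.124}$$


Prove that this operation, in turn, can be generated by the operator expression:


$$\mathcal{C} = \exp\left(\frac{i\pi}{2}\int d^3p\sum_s\left[b^\dagger(\mathbf{p},s) - d^\dagger(\mathbf{p},s)\right]\left[b(\mathbf{p},s) - d(\mathbf{p},s)\right]\right) \tag{4.125}$$

(for $\phi = 0$).

6. Prove Eq. (4.66).

7. Show that:

$$(\mathcal{CPT})\,U(t,t_0)\,(\mathcal{CPT})^{-1} = U(-t,-t_0) \tag{4.126}$$

where $U$ is the operator $e^{-iHt}$ for constant $H$, so therefore:

$$(\mathcal{CPT})\,S\,(\mathcal{CPT})^{-1} = S^\dagger \tag{4.127}$$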


where $S$ is the $S$ matrix.


8. Consider the magnetic field created by electrons both moving in a straight
wire and circulating in a solenoid. Also consider the electric field created by
a stationary electron. Draw the diagrams for these systems when a C, P, and
T transformation is applied to them. Show that we reproduce the results in
Eq. (4.84).


9. Choose the $A_0 = 0$ gauge. Since $A_0$ is no longer a Lagrange multiplier,
this means that Gauss’s Law can no longer be applied; therefore, we cannot
impose $\nabla\cdot\mathbf{E} = 0$. This means we cannot reduce the system to the Coulomb
gauge. Resolve this paradox. [Hint: show that Gauss’s Law commutes with
the Hamiltonian, and then use Eqs. (1.30) and (1.31).]


10. Given a Lagrangian:


$$\mathcal{L} = \epsilon^{\mu\nu\sigma\rho}F_{\mu\nu}F_{\sigma\rho} \tag{4.128}$$


what symmetries among C, P, T are broken?


11. Consider the coupling: $\bar\psi\sigma_{\mu\nu}\psi F^{\mu\nu}$. Take the nonrelativistic
limit of this expression, and show how this relates to the magnetic moment of the
electron.


12. Prove that the inversion of the operator in Eq. (4.71) gives us the propagator
in Eq. (4.72).


13. Prove that the canonical commutation relation in Eq. (4.58) is satisfied if
we take the commutation relations Eq. (4.63) among the harmonic oscillator
states. Show that transversality is preserved.


14. It is often convenient to describe gauge theory in the mathematical language of
forms (especially when working in higher dimensions). Let the infinitesimal
$dx^\mu$ be antisymmetric under an operation we call $\wedge$; that is,
$dx^\mu\wedge dx^\nu = -dx^\nu\wedge dx^\mu$. Define the operator $d = dx^\mu\partial_\mu$.
Prove that $d$ is nilpotent, that is, $d^2 = 0$. Define the one-form $A = A_\mu dx^\mu$.
Define a $p$-form as
$\omega = \omega_{\mu_1\mu_2\cdots\mu_p}\,dx^{\mu_1}\wedge dx^{\mu_2}\wedge\cdots\wedge dx^{\mu_p}$.
Show that $dA = F$, where $F = F_{\mu\nu}\,dx^\mu\wedge dx^\nu$, where $A$ is the vector
potential and $F$ is the Maxwell tensor.


15. In $n$ dimensions, define the Hodge operator $*$, which generates a duality
transformation:


$$*\left(dx^{\mu_1}\wedge dx^{\mu_2}\wedge\cdots\wedge dx^{\mu_p}\right) = \frac{1}{(n-p)!}\,\epsilon_{\mu_1\mu_2\cdots\mu_n}\,dx^{\mu_{p+1}}\wedge\cdots\wedge dx^{\mu_n} \tag{4.129}$$

for $p < n$. Show that Maxwell’s equations can now be summarized as:


$$dF = 0; \qquad d{*}F = J \tag{4.130}$$


Show that the Bianchi identity is a consequence of $d^2 = 0$.


16. Define the operator $\delta$ as:

$$\delta = (-1)^{np+n+1}\,{*}\,d\,{*} \tag{4.131}$$

Then show that $\delta^2 = 0$ and that the Laplacian is given by:

$$\Delta = (d+\delta)^2 = d\delta + \delta d \tag{4.132}$$


17. Prove that $T^{\mu\nu}$ in Eq. (4.17) is conserved. (Hint: use the Bianchi identity.)


Chapter 5 


Feynman Rules 
and LSZ Reduction 


The reason Dick's physics was so hard for ordinary people to grasp was 
that he did not use equations. The usual way theoretical physics was done 
since the time of Newton was to begin by writing down some equations 
and then to work hard calculating solutions of the equations .... He had 
a physical picture of the way things happen, and the picture gave him the 
solution directly with a minimum of calculation. It was no wonder that 
people who had spent their lives solving equations were baffled by him. 
Their minds were analytical; his was pictorial. 

—F. Dyson on R. Feynman 


5.1 Cross Sections 


So far, our discussion has been rather formal, with no connection to experiment.
This is because we have been concentrating on Green’s functions, which are
unphysical; that is, they describe the motion of “off-shell” particles where
$p^2 \neq m^2$. However, the physical world that we measure in our laboratories is
on-shell. To make the connection to experiment, we need to rewrite our previous
results in terms of numbers that can be measured in the laboratory, such as decay
rates of unstable particles and scattering cross sections. There are many ways in
which to define the cross section, but perhaps the simplest and most intuitive way is
to define it as the effective “size” of each particle in the target:


Cross section = Effective size of target particle (5.1) 


The cross section is thus the effective area of each target particle as seen by an
incoming beam. Cross sections are often measured in terms of “barns.” (One barn
is $10^{-24}$ cm$^2$.) A nucleon is about one fermi, or $10^{-13}$ cm across. Its area is
therefore about $10^{-26}$ cm$^2$, or about 0.01 barns. Thus, by giving the cross section
of a particle in a certain reaction, we can immediately calculate the effective size
of that particle in relationship to a nucleon.

Figure 5.1. A target with $N_T$ nuclei is bombarded with a beam with $N_B$ particles. The
cross section $\sigma$ is the effective size in cm$^2$ of each nucleus as seen by the beam.

To calculate the cross section in terms of the rate of collisions in a scattering
experiment, let us imagine a thin target with $N_T$ particles in it, each particle with
effective area, or cross section, $\sigma$. As seen from an incoming beam, the total amount
of area taken up by these particles is therefore $N_T\sigma$. If we aim a beam of particles
at the target with area $A$, then the chance of hitting one of these particles is equal
to the total area that these target particles occupy ($N_T\sigma$) divided by the area $A$:

$$ \text{Chance of hitting a particle} = \frac{N_T\sigma}{A} \tag{5.2} $$

Let us say we fire a beam containing $N_B$ particles at the target. Then the
number of particles in the beam that are absorbed or deflected is $N_B$ times the
chance of being hit. Thus, the number of scattering events is given by:

$$ \text{Number of events} = N_B\,\frac{N_T\sigma}{A} \tag{5.3} $$

or simply:

$$ \sigma = \frac{\text{number of events}}{N_B N_T/A} \tag{5.4} $$

This reconfirms that the cross section has dimensions of area (Fig. 5.1).
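
As a quick numerical illustration of Eqs. (5.3) and (5.4), the following sketch inverts an event count to recover the cross section and converts it to barns. The function names and the sample beam and target numbers are illustrative assumptions, not values from the text.

```python
# Sketch of Eq. (5.4): sigma = (number of events) * A / (N_B * N_T).
# All names and sample values here are illustrative assumptions.

BARN_IN_CM2 = 1.0e-24  # one barn expressed in cm^2


def cross_section_cm2(n_events, n_beam, n_target, area_cm2):
    """Effective area per target particle, in cm^2, from Eq. (5.4)."""
    return n_events * area_cm2 / (n_beam * n_target)


def to_barns(sigma_cm2):
    """Convert a cross section from cm^2 to barns."""
    return sigma_cm2 / BARN_IN_CM2


# A nucleon about one fermi (1e-13 cm) across has an area of about 1e-26 cm^2.
nucleon_area_barns = to_barns((1.0e-13) ** 2)

# Hypothetical experiment: 1e12 beam particles fired at 1e20 target particles
# spread over 1 cm^2, producing 1e6 scattering events.
sigma_example = cross_section_cm2(1.0e6, 1.0e12, 1.0e20, 1.0)
sigma_example_barns = to_barns(sigma_example)
```

In this made-up example the inferred cross section comes out at the nucleon scale quoted above, about 0.01 barns.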

In actual practice, a more convenient way of expressing the cross section is
via the flux of the incoming beam, which is equal to $\rho v$. If the beam is moving
at velocity $v$ toward a stationary target, then the number of particles in the beam
$N_B$ is equal to the density of the beam $\rho$ times the volume. If the beam is a pulse
that is turned on for $t$ seconds, then the volume of the beam is $vtA$. Therefore,
$N_B = \rho vtA$. The cross section can therefore be written as:

$$ \sigma = \frac{\text{number of events}/t}{(\rho vtA)N_T/(tA)} = \frac{\text{number of events}/t}{\rho v} = \frac{\text{transition rate}}{\text{flux}} \tag{5.5} $$

where we have normalized $N_T$ to be 1 and the transition rate is the number of
scattering events per second. The cross section is therefore equal to the transition 
rate divided by the flux of the beam. (More precisely, the cross section is equal 
to the transition probability per scatterer in the target and per unit incident flux.) 
This is usually taken as the starting point in discussions of the cross section, but 
unfortunately it is rather obscure and does not reveal its intuitive meaning as the 
effective size of the target particle. 

(We will be calculating the cross section in collinear Lorentz frames, i.e., 
where the incoming beam and target move along the same axis. Two common 
collinear frames are the laboratory frame and the center-of-mass frame. If we 
make a Lorentz transformation to any other collinear frame, the cross section is an 
invariant, since a Lorentz contraction does not affect the cross section if we make 
a boost along this axis. However, the cross section is not a true Lorentz invariant. 
In fact, it transforms like an area under arbitrary Lorentz transformations.) 

The next problem is to write the transition rate appearing in the cross section in 
terms of the S matrix. We must therefore calculate the probability that a collection 
of particles in some initial state i will decay or scatter into another collection of 
particles in some final state $f$. From ordinary nonrelativistic quantum mechanics,
we know that the cross section $\sigma$ can be calculated by analyzing the properties
of the scattered wave. Using classical wave function techniques dating back to
Rayleigh, we know that a plane wave $e^{ikz}$ scattering off a stationary, hard target is
given by:

$$ e^{ikz} + \frac{f(\theta)}{r}\,e^{ikr} \tag{5.6} $$

where the term with $e^{ikr}$ represents the scattered wave, which is expanding radially
from the target. Therefore $|f(\theta)|^2$ is proportional to the probability that a particle
scatters into an angle $\theta$.

More precisely, the differential cross section is given by the square of $f(\theta)$:

$$ \frac{d\sigma}{d\Omega} = |f(\theta)|^2 \tag{5.7} $$

where the solid angle differential is given by:

$$ \int d\Omega = \int_0^{2\pi} d\phi \int_{-1}^{1} d\cos\theta \tag{5.8} $$

and the total cross section is given by:

$$ \int d\Omega\,|f(\theta)|^2 = \int d\Omega\,\frac{d\sigma}{d\Omega} = \sigma \tag{5.9} $$


For our purposes, however, this formulation is not suitable because it is inherently
nonrelativistic. To give a relativistic formulation, let us start at the beginning. We
wish to describe the scattering process that takes us from an initial state consisting
of a collection of free, asymptotic states at $t \to -\infty$ to a final state $|f\rangle$ at $t \to \infty$.
To calculate the probability of taking us from the initial state to the final state, we
introduce the $S$ matrix:

$$ S_{fi} = \langle f|S|i\rangle = \delta_{fi} - i(2\pi)^4\,\delta^4(P_f - P_i)\,\mathcal{T}_{fi} \tag{5.10} $$

where $\delta_{fi}$ symbolically represents the particles not interacting at all, and $\mathcal{T}_{fi}$ is
called the transition matrix, which describes non-trivial scattering.

One of the fundamental constraints coming from quantum mechanics is that
the $S$ matrix is unitary:

$$ \sum_f S^*_{fi}\,S_{fk} = \delta_{ik} \tag{5.11} $$

By taking the square of the $S$ matrix, we can calculate the transition probabilities.
The probability that the collection of states $i$ will make the transition to the final
states $f$ is given by:

$$ P_{fi} = S^*_{fi}\,S_{fi} \tag{5.12} $$

Likewise, the total probability that the initial states $i$ will scatter into all possible
final states $f$ is given by:

$$ P_{\text{total}} = \sum_f S^*_{fi}\,S_{fi} \tag{5.13} $$

Now we must calculate precisely what we mean by $\sum_f$. We begin by defining
our states within a box of volume $V$:


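$$ \begin{aligned} |p\rangle &= \sqrt{(2\pi)^3\,2E_p/V}\;a^\dagger(p)|0\rangle \quad &\text{(bosons)} \\ |p\rangle &= \sqrt{(2\pi)^3\,E_p/(mV)}\;a^\dagger(p)|0\rangle \quad &\text{(fermions)} \end{aligned} \tag{5.14} $$

Our states are therefore normalized as follows:

$$ \begin{aligned} \langle p|p'\rangle &= (2\pi)^3\,2E_p\,\delta^3(\mathbf p - \mathbf p')/V \quad &\text{(bosons)} \\ \langle p|p'\rangle &= (2\pi)^3\,\frac{E_p}{m}\,\delta^3(\mathbf p - \mathbf p')/V \quad &\text{(fermions)} \end{aligned} \tag{5.15} $$

With this normalization, the unit operator (on single particle states) can be
expressed as:

$$ \begin{aligned} 1 &= \int \frac{V\,d^3p}{(2\pi)^3\,2E_p}\,|p\rangle\langle p| \quad &\text{(bosons)} \\ 1 &= \int \frac{V\,d^3p\;m}{(2\pi)^3\,E_p}\,|p\rangle\langle p| \quad &\text{(fermions)} \end{aligned} \tag{5.16} $$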


To check our normalizations, we can let the number one act on an arbitrary state 
|q), and we see that it leaves the state invariant. This means, however, that we 
have an awkward definition of the number of states at a momentum p. With this 
normalization, we find that: 


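$$ \langle p|p\rangle = (2\pi)^3\,2E_p\,\delta^3(0)/V \tag{5.17} $$

which makes no sense. However, we will interpret this to mean that we are
actually calculating particle densities inside a large but finite box of size $L$ and
volume $V$; that is, we define:

$$ \delta^3(\mathbf p) = \lim_{L\to\infty}\frac{1}{(2\pi)^3}\int_{-L/2}^{L/2}\!\int_{-L/2}^{L/2}\!\int_{-L/2}^{L/2} dx\,dy\,dz\;e^{-i\mathbf p\cdot\mathbf x} \tag{5.18} $$

This implies that we take the definition:

$$ \delta^3(0) = \frac{V}{(2\pi)^3} $$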

We will let the volume of the box V tend to infinity only at the end of the 
calculation. The origin of this problem is that we have been dealing with plane 
waves, rather than wave packets that are confined to a specific region of space 
and time. The price we pay for these nonlocalized plane waves is that we must 
carefully divide out infinite quantities proportional to the volume of space and 
time. (A more careful analysis would use wave packets that are completely 
localized in space and time; i.e., they have an envelope that restricts most of the
wave packet to a definite region of space and time. This analysis with wave 
packets is somewhat more complicated, but yields precisely the same results.) 

Our task is now to calculate the scattering cross section $1 + 2 \to 3 + 4 + \cdots$
and the rate of decay of a single particle $1 \to 2 + 3 + \cdots$.

We must now define how we normalize the sum over final states. We will 
integrate over all momenta of the various final states, and sum over all possible 
final states. For each final state, we will integrate over the final momentum in a 
Lorentz covariant fashion. We will use: 


$$ \int \frac{d^4p}{(2\pi)^3}\,\delta(p^2 - m^2)\,\theta(p_0) = \int \frac{d^3p}{(2\pi)^3\,2E_p} \tag{5.19} $$

The density of states $dN_f$, that is, the number of states within $p$ and $p + dp$, is:

$$ dN_f = \prod_{i=1}^{N_f}\frac{V\,d^3p_i}{(2\pi)^3} \tag{5.20} $$

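As before, the differential cross section $d\sigma$ is the number of transitions per
unit time per unit volume divided by the flux $J$ of incident particles:

$$ d\sigma = \frac{\text{transitions per second per cm}^3}{\text{incident flux}} = \left(\frac{|S_{fi}|^2\,dN_f}{VT}\right)\frac{1}{J} \tag{5.21} $$

We also know that the transition rate per unit volume (within a momentum-space
interval) is given by:

$$ \begin{aligned} \text{Transition rate per cm}^3 &= \frac{|S_{fi}|^2\,dN_f}{VT} \\ &= \frac{(2\pi)^8\,|\mathcal{T}_{fi}|^2\,\delta^4(P_f - P_i)\,\delta^4(0)}{VT}\,dN_f \\ &= (2\pi)^4\,\delta^4(P_f - P_i)\,|\mathcal{T}_{fi}|^2\,dN_f \end{aligned} \tag{5.22} $$

where $(2\pi)^4\delta^4(0) = VT$.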

To calculate the incident flux, we will first take a collinear frame, such as the
laboratory frame or center-of-mass frame. The incident flux $J$ equals the product
of the density of the initial state ($1/V$) and the relative velocity $v = |\mathbf v_1 - \mathbf v_2|$,
where $|\mathbf v_i| = |\mathbf p_i|/E_i$:

$$ J = |\mathbf v_1 - \mathbf v_2|/V \tag{5.23} $$

In the center-of-mass frame, where $\mathbf p_1 = -\mathbf p_2$, we have:

$$ \begin{aligned} V(2E_1)(2E_2)J &= (2E_1)(2E_2)\,|\mathbf v_1 - \mathbf v_2| = 2E_1\,2E_2\left|\frac{\mathbf p_1}{E_1} - \frac{\mathbf p_2}{E_2}\right| \\ &= 4\,|\mathbf p_1E_2 - \mathbf p_2E_1| \\ &= 4\,|\mathbf p_1|(E_1 + E_2) = 4\left[(p_1\cdot p_2)^2 - m_1^2m_2^2\right]^{1/2} \end{aligned} \tag{5.24} $$
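
The chain of equalities in Eq. (5.24) can be checked numerically. The sketch below, with illustrative masses and momenta chosen here rather than taken from the text, compares the velocity form $(2E_1)(2E_2)|v_1 - v_2|$, the center-of-mass form $4|\mathbf p_1|(E_1+E_2)$, and the invariant form, and also repeats the check after a boost along the beam axis, where the frame stays collinear.

```python
import math

# Numerical check of Eq. (5.24) for collinear motion along z.
# Masses, momenta and the boost velocity are illustrative assumptions.


def energy(p, m):
    """On-shell energy for momentum p and mass m."""
    return math.sqrt(p * p + m * m)


def boost_z(e, p, beta):
    """Lorentz boost of (E, p_z) along z with velocity beta."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return gamma * (e - beta * p), gamma * (p - beta * e)


def flux_velocity_form(e1, p1, e2, p2):
    """(2 E1)(2 E2) |v1 - v2| with v = p / E along the common axis."""
    return 4.0 * e1 * e2 * abs(p1 / e1 - p2 / e2)


def flux_invariant_form(e1, p1, e2, p2, m1, m2):
    """4 [ (p1.p2)^2 - m1^2 m2^2 ]^(1/2) with the Minkowski product p1.p2."""
    dot = e1 * e2 - p1 * p2
    return 4.0 * math.sqrt(dot * dot - (m1 * m2) ** 2)


m1, m2, p = 0.938, 0.140, 0.5
e1, e2 = energy(p, m1), energy(p, m2)

# Center-of-mass frame: momenta +p and -p.
cm_velocity = flux_velocity_form(e1, p, e2, -p)
cm_closed = 4.0 * abs(p) * (e1 + e2)
cm_invariant = flux_invariant_form(e1, p, e2, -p, m1, m2)

# Boost both particles along z; the frame is still collinear.
be1, bp1 = boost_z(e1, p, 0.6)
be2, bp2 = boost_z(e2, -p, 0.6)
boosted_velocity = flux_velocity_form(be1, bp1, be2, bp2)
```

All four numbers agree to rounding error, which illustrates that the flux factor is unchanged by boosts along the collision axis.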


(The last step is a bit deceiving, since it appears as if the flux is a Lorentz invariant 
in all frames. The last equality only holds if the two particles are collinear. The last 
step is not necessarily true for arbitrary Lorentz frames in which the two particles 
are not collinear. To see this, we can also write the flux for a beam moving in the 
z direction as: 

$$ J \sim \epsilon_{xy\mu\nu}\,p_1^\mu\,p_2^\nu \tag{5.25} $$


The flux now transforms as the x, y component of an antisymmetric second-rank 
tensor; that is, as an inverse area. Since the cross section is an area as seen by a 
particle in the beam, we expect it should transform as an area under an arbitrary 
Lorentz transformation. From now on, we assume that all Lorentz frames are 
collinear, so we can drop this distinction.) 

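The final formula for bosons for the differential cross section for $1 + 2 \to 3 + 4 + \cdots$
is therefore given by:

$$ d\sigma = \frac{(2\pi)^4\,|\mathcal{M}_{fi}|^2}{4\left[(p_1\cdot p_2)^2 - m_1^2m_2^2\right]^{1/2}}\;\delta^4(P_f - P_i)\prod_{i=3}^{N}\frac{d^3p_i}{(2\pi)^3\,2E_{p_i}} \tag{5.26} $$

in a collinear frame where $\mathcal{T}_{fi} = \prod_i (2E_{p_i}V)^{-1/2}\,\mathcal{M}_{fi}$. Notice that all factors
of $V$ in the $S$ matrix have precisely cancelled against other factors coming from
$dN_f$ and the flux.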

[We should note that other normalization conventions are possible. For ex- 
ample, we can always change the fermion normalization such that the 2m factor 
appearing in Eqs. (3.109) and (3.110) disappears. Then, with this new normal- 
ization, Eq. (5.26) works for both fermions and bosons. The advantage of this is 
that we can then use Eq. (5.26) without having to make the distinction between 
boson and fermion normalizations.]

Finally, we will use this formalism to compute the probability of the decay of
a single particle. The decay probability is given by:

$$ \begin{aligned} P_{\text{total}} &= \sum_f\int dN_f\,|S_{fi}|^2 \\ &= \sum_f\int dN_f\,|\mathcal{T}_{fi}|^2\left[(2\pi)^4\,\delta^4(P_f - P_i)\right]^2 \end{aligned} \tag{5.27} $$

where we have taken $\delta_{fi} = 0$ for a decay process. The last expression, unfortunately,
is singular because of the delta function squared. As before, however,
we assume that all our calculations are being performed in a large but finite box
of volume $V$ over a large time interval $T$. We thus reinterpret one of the delta
functions as:

$$ \delta^4(0) = \frac{VT}{(2\pi)^4} \tag{5.28} $$

We now define the decay rate $\Gamma$ of an unstable particle as the transition probability
per unit volume of space and time:

$$ \Gamma = \frac{\text{transition probability}}{\text{sec}\times\text{volume}} = \frac{P_{\text{total}}}{VT(2E_i)} \tag{5.29} $$


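The final result for the decay rate is given by:

$$ \Gamma = \frac{1}{2E_i}\sum_f\int\prod_{j=1}^{N_f}\frac{d^3p_j}{(2\pi)^3\,2E_j}\;|\mathcal{M}_{fi}|^2\,(2\pi)^4\,\delta^4(P_f - P_i) \tag{5.30} $$

The lifetime of the particle $\tau$ is then defined as the inverse of the decay rate:

$$ \tau = \frac{1}{\Gamma} \tag{5.31} $$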
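
Equation (5.31) is written in natural units. To quote a lifetime in seconds one restores a factor of $\hbar$, so that $\tau = \hbar/\Gamma$. The short sketch below does this conversion; the function names are illustrative, and the muon lifetime of about $2.197\times10^{-6}$ s is a standard reference value introduced here, not a number from the text.

```python
# Convert between a decay width (in GeV) and a lifetime (in seconds)
# using tau = hbar / Gamma. Names are illustrative assumptions.

HBAR_GEV_S = 6.582119569e-25  # reduced Planck constant in GeV * s


def lifetime_seconds(width_gev):
    """Lifetime in seconds for a decay width given in GeV."""
    return HBAR_GEV_S / width_gev


def width_gev(lifetime_s):
    """Decay width in GeV for a lifetime given in seconds."""
    return HBAR_GEV_S / lifetime_s


# Muon: a lifetime of about 2.197e-6 s corresponds to a width near 3e-19 GeV.
muon_width = width_gev(2.197e-6)
muon_lifetime_roundtrip = lifetime_seconds(muon_width)
```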


5.2 Propagator Theory and Rutherford Scattering 


Historically, calculations in QED were performed using two seemingly independent
formulations. One formulation was developed by Schwinger¹ and Tomonaga²
using a covariant generalization of operator methods developed in quantum
mechanics. However, the formulation was exceedingly difficult to calculate with
and was physically opaque. The second formulation was developed by Feynman³
using the propagator approach. Feynman postulated a list of simple “rules” from
which one could pictorially set up the calculation for scattering matrices of
arbitrary complexity. The weakness of Feynman’s graphical methods, however, was
that they were not rigorously justified. Later, Dyson demonstrated the equivalence
of these two formulations by deriving Feynman’s rules from the interaction
picture.

In this chapter, we first follow Feynman to show how the propagator method
gives us a rapid, convenient method of calculating the lowest order terms in the
scattering matrix. Then we will develop the Lehmann-Symanzik-Zimmermann
(LSZ) reduction formalism, in which one can develop the Feynman rules for
diagrams of arbitrary complexity.

At this point, we should emphasize that the Green’s functions that appear in
the propagator approach are “off-shell”; that is, they do not satisfy the mass-shell
condition $p_\mu^2 = m^2$. Neither do they obey the usual equations of motion. The
Green’s functions describe virtual particles, not physical ones. As we saw in the
previous chapter, the Green’s function develops a pole in momentum space at
$p_\mu^2 = m^2$. However, there is no violation of cherished physical principles because
the Green’s functions are not measurable quantities. The only measurable quantity
is the $S$ matrix, where the external particles obey the mass-shell condition.

To begin calculating cross sections, let us review the propagator method in
ordinary quantum mechanics, where we wish to solve the equation:

$$ \left(i\frac{\partial}{\partial t} - H\right)\psi = 0 \tag{5.32} $$

We assume that the true Hamiltonian is split into two pieces: $H = H_0 + H_I$,
where the interaction piece $H_I$ is small. We wish to solve for the propagator
$G(\mathbf x, t; \mathbf x', t')$:

$$ \left(i\frac{\partial}{\partial t} - H_0 - H_I\right)G(\mathbf x, t; \mathbf x', t') = \delta^3(\mathbf x - \mathbf x')\,\delta(t - t') \tag{5.33} $$

If we could solve for the Green’s function for the interacting case, then we could
use Huygens’ principle to solve for the time evolution of the wave. We recall
that Huygens’ principle says that the future evolution of a wave front can be
determined by assuming that every point along a wave front is an independent
source of an infinitesimal wave. By adding up the contribution of all these small
waves, we can determine the future location of the wave front. Mathematically,
this is expressed by the equation:

$$ \psi(\mathbf x, t) = \int d^3x'\,G(\mathbf x, t; \mathbf x', t')\,\psi(\mathbf x', t'); \qquad t > t' \tag{5.34} $$


Our next goal, therefore, is to solve for the complete Green’s function $G$, which we
do not know, in terms of the free Green’s function $G_0$, which is well understood.
To find the propagator for the interacting case, we have to power expand in $H_I$.
We will use the following formula for operators $A$ and $B$:

$$ \begin{aligned} \frac{1}{A+B} &= \frac{1}{A(1 + A^{-1}B)} = \frac{1}{1 + A^{-1}B}\,A^{-1} \\ &= \left(1 - A^{-1}B + A^{-1}BA^{-1}B - \cdots\right)A^{-1} \\ &= A^{-1} - A^{-1}BA^{-1} + A^{-1}BA^{-1}BA^{-1} - \cdots \end{aligned} \tag{5.35} $$

Another way of writing this is:

$$ \frac{1}{A+B} = A^{-1} - \frac{1}{A+B}\,BA^{-1} \tag{5.36} $$

Now let:

$$ \begin{aligned} A &= i\frac{\partial}{\partial t} - H_0 \\ B &= -H_I \\ G &= \frac{1}{A+B} \\ G_0 &= \frac{1}{A} \end{aligned} \tag{5.37} $$


Then we have the symbolic identities:

$$ \begin{aligned} G &= G_0 + G\,H_I\,G_0 \\ G &= G_0 + G_0H_IG_0 + G_0H_IG_0H_IG_0 + \cdots \end{aligned} \tag{5.38} $$

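
The operator expansion in Eqs. (5.35) and (5.38) can be tried out on small matrices. In the sketch below, $A$ and $B$ are arbitrary 2×2 matrices chosen here for illustration (they are not physical Hamiltonians), with $G_0 = A^{-1}$ and $H_I = -B$; the partial sums of the series are compared with the exact inverse $(A+B)^{-1}$. The helper names are assumptions of this example.

```python
# Finite-matrix illustration of G = G0 + G0 H G0 + G0 H G0 H G0 + ...
# with G0 = A^{-1}, H = -B and G = (A + B)^{-1}.
# The 2x2 matrices are arbitrary illustrative choices.


def matmul(x, y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def matadd(x, y):
    """Sum of two 2x2 matrices."""
    return [[x[i][j] + y[i][j] for j in range(2)] for i in range(2)]


def inverse(x):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = x
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def born_series(a, b, order):
    """Partial sum of the series with up to `order` insertions of H = -B."""
    g0 = inverse(a)
    h = [[-b[i][j] for j in range(2)] for i in range(2)]
    term = g0
    total = g0
    for _ in range(order):
        term = matmul(matmul(term, h), g0)  # append one more factor H G0
        total = matadd(total, term)
    return total


def max_abs_diff(x, y):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(x[i][j] - y[i][j]) for i in range(2) for j in range(2))


A = [[2.0, 0.3], [0.1, 1.5]]
B = [[0.10, -0.05], [0.02, 0.08]]

exact = inverse(matadd(A, B))
error_first_order = max_abs_diff(exact, born_series(A, B, 1))
error_eighth_order = max_abs_diff(exact, born_series(A, B, 8))
```

Because $B$ is small compared with $A$ in this example, each extra term shrinks the error, which is the matrix analogue of assuming that the interaction $H_I$ is weak.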


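More explicitly, we can recursively write this as:

$$ G(\mathbf x, t; \mathbf x', t') = G_0(\mathbf x, t; \mathbf x', t') + \int dt_1\int d^3x_1\,G(\mathbf x, t; \mathbf x_1, t_1)\,H_I(\mathbf x_1, t_1)\,G_0(\mathbf x_1, t_1; \mathbf x', t') \tag{5.39} $$

If we power expand this expression, we find:

$$ \begin{aligned} G(\mathbf x, t; \mathbf x', t') &= G_0(\mathbf x, t; \mathbf x', t') + \int_{-\infty}^{\infty} dt_1\int d^3x_1\,G_0(\mathbf x, t; \mathbf x_1, t_1)\,H_I(\mathbf x_1, t_1)\,G_0(\mathbf x_1, t_1; \mathbf x', t') \\ &\quad + \int_{-\infty}^{\infty} dt_1\,dt_2\int d^3x_1\,d^3x_2\,G_0(\mathbf x, t; \mathbf x_1, t_1)\,H_I(\mathbf x_1, t_1) \\ &\qquad \times G_0(\mathbf x_1, t_1; \mathbf x_2, t_2)\,H_I(\mathbf x_2, t_2)\,G_0(\mathbf x_2, t_2; \mathbf x', t') + \cdots \end{aligned} \tag{5.40} $$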


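Using Huygens’ principle, we can power expand for the time evolution of the
wave function:

$$ \begin{aligned} \psi(\mathbf x, t) &= \psi_0(\mathbf x, t) + \int d^4x_1\,G_0(\mathbf x, t; \mathbf x_1, t_1)\,H_I(\mathbf x_1, t_1)\,\psi_0(\mathbf x_1, t_1) \\ &\quad + \int d^4x_1\,d^4x_2\,G_0(\mathbf x, t; \mathbf x_1, t_1)\,H_I(\mathbf x_1, t_1)\,G_0(\mathbf x_1, t_1; \mathbf x_2, t_2)\,H_I(\mathbf x_2, t_2)\,\psi_0(\mathbf x_2, t_2) \\ &\quad + \cdots + \int d^4x_1\cdots d^4x_n\,G_0(\mathbf x, t; \mathbf x_1, t_1)\,H_I(\mathbf x_1, t_1)\cdots \\ &\qquad \times G_0(\mathbf x_{n-1}, t_{n-1}; \mathbf x_n, t_n)\,H_I(\mathbf x_n, t_n)\,\psi_0(\mathbf x_n, t_n) \end{aligned} \tag{5.41} $$

Figure 5.2. In the propagator approach, perturbation theory can be pictorially represented
as a particle interacting with a background potential at various points along its trajectory.

In Figure 5.2, we have a pictorial representation of this expansion.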

Let us now solve for the $S$ matrix in lowest order. We postulate that at infinity,
there are free plane waves given by $\phi = e^{-ik\cdot x}/(2\pi)^{3/2}$. We want to calculate
the transition probability that a wave packet starts out in a certain initial state $i$,
scatters off the potential, and then re-emerges as another free plane wave, but in a
different final state $f$. To lowest order, the transition probability can be calculated
by examining Huygens’ principle:

$$ \psi(\mathbf x, t) = \phi_i(\mathbf x, t) + \int d^4x'\,G_0(\mathbf x, t; \mathbf x', t')\,H_I(\mathbf x', t')\,\phi_i(\mathbf x', t') + \cdots \tag{5.42} $$

To extract the $S$ matrix, multiply this equation on the left by $\phi_f^*$ and integrate.
The first term on the right then becomes $\delta_{fi}$. Using the power expansion of the
Green’s function, we can express the Green’s function $G_0$ in terms of these free
fields. After integration, we find:

$$ S_{fi} = \delta_{fi} - i\int d^4x'\,\phi_f^*(x')\,H_I(x')\,\phi_i(x') + \cdots \tag{5.43} $$


Therefore, the transition matrix is proportional to the matrix element of the
potential $H_I$. We will now generalize this exercise to the problem in question: the
calculation in QED of the scattering of an electron due to a stationary Coulomb
potential. Our calculation should be able to reproduce the old Rutherford scattering
amplitude to lowest order in the nonrelativistic limit and give higher-order
quantum corrections to it. Our starting point is the Dirac electron in the presence
of an external, classical Coulomb potential (Fig. 5.3).

Figure 5.3. An electron scatters off a stationary Coulomb field in lowest order perturbation
theory. This reproduces the Rutherford scattering cross section in the nonrelativistic limit.

The interacting Dirac equation reads:

$$ (i\gamma^\mu\partial_\mu - m)\,\psi(x) = e\slashed{A}(x)\,\psi(x) \tag{5.44} $$

Since we are only working to lowest order and are treating the potential $A_\mu$ as
a classical potential, we can solve this equation using only propagator methods.
The solution of this equation, as we have seen, is given by:

$$ \psi(x) = \psi_0(x) + e\int d^4y\,S_F(x - y)\,\slashed{A}(y)\,\psi(y) \tag{5.45} $$

where $\psi_0$ is a solution of the free, homogeneous Dirac equation. To calculate
the scattering matrix, it is convenient to insert the expansion of the Feynman
propagator $S_F(x - x')$ in terms of the time-ordered function $\theta(t - t')$, as in Eq.
(3.137). Then we find:

$$ \psi(x) = \psi_i(x) - ie\int d^4y\,\theta(t - t')\int d^3p\,\sum_{r=1}^{2}\psi_p^r(x)\,\bar\psi_p^r(y)\,\slashed{A}(y)\,\psi_i(y) \tag{5.46} $$

for $t \to \infty$. We now wish to extract from this expression the amplitude that the
outgoing wave $\psi(x)$ will be scattered in the final state, given by $\psi_f(x)$. This is
easily done by multiplying both sides of the equation by $\bar\psi_f$ and integrating over
all space-time. The result gives us the $S$ matrix to lowest order:

$$ S_{fi} = \delta_{fi} - ie\int d^4x\,\bar\psi_f(x)\,\slashed{A}(x)\,\psi_i(x) \tag{5.47} $$
Now we insert the expression for the vector potential, which corresponds to an
electric potential $A_0$ given by the standard $1/r$ Coulomb potential:

$$ A_0(x) = -\frac{Ze}{4\pi|\mathbf x|} \tag{5.48} $$

whose Fourier transform is given by:

$$ \int d^3x\,\frac{e^{-i\mathbf q\cdot\mathbf x}}{|\mathbf x|} = \frac{4\pi}{|\mathbf q|^2} \tag{5.49} $$

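Inserting the plane-wave expression for the fermion fields into the expression for
the scattering matrix in Eq. (5.47) and performing the integration over $x$, we have:

$$ \begin{aligned} S_{fi} &= \frac{iZe^2}{4\pi}\,\frac{1}{V}\int d^4x\,\sqrt{\frac{m}{E_f}}\,\bar u(p_f, s_f)\,e^{ip_f\cdot x}\,\gamma^0\,\frac{1}{|\mathbf x|}\,\sqrt{\frac{m}{E_i}}\,u(p_i, s_i)\,e^{-ip_i\cdot x} \\ &= \frac{iZe^2}{V}\sqrt{\frac{m^2}{E_fE_i}}\;\frac{\bar u(p_f, s_f)\,\gamma^0\,u(p_i, s_i)}{|\mathbf q|^2}\;2\pi\,\delta(E_f - E_i) \end{aligned} \tag{5.50} $$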

We recall that $V\,d^3p_f/(2\pi)^3$ is the number of final states contained in the
momentum interval $d^3p_f$. Multiply this by $|S_{fi}|^2$ and we have the probability
of transition per particle into these states. [We recall that squaring the $S$ matrix
gives us divergent quantities like $\delta(0)$, which is due to the fact that we have not
rigorously localized the wave packets. We set $2\pi\delta(0) = T$, where we localize the
scattering process in a box of size $V$ and duration $T$.]

If we divide by $T$, this gives us the rate $R$ of transitions per unit time into this
momentum interval. Finally, if we divide the rate of transitions by the flux of
incident particles $|\mathbf v_i|/V$, this gives us the differential cross section:

$$ d\sigma = \frac{V\,d^3p_f}{(2\pi)^3}\,|S_{fi}|^2\,\frac{1}{T\,|\mathbf v_i|/V} \tag{5.51} $$


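To calculate the differential cross section per unit of solid angle, we must
decompose the momentum volume element:

$$ d^3p = d\Omega\,p^2\,dp \tag{5.52} $$

Using the fact that $p_f\,dp_f = E_f\,dE_f$, we have the result:

$$ \begin{aligned} \frac{d\sigma}{d\Omega} &= \frac{4Z^2\alpha^2m^2}{|\mathbf q|^4}\,\frac{1}{2}\sum_{s_i, s_f}\left|\bar u(p_f, s_f)\,\gamma^0\,u(p_i, s_i)\right|^2 \\ &= \frac{4Z^2\alpha^2m^2}{|\mathbf q|^4}\,\frac{1}{2}\,\text{Tr}\left(\gamma^0\,\frac{\slashed{p}_i + m}{2m}\,\gamma^0\,\frac{\slashed{p}_f + m}{2m}\right) \end{aligned} \tag{5.53} $$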
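In the last step, we have used the fact that the summation of spins can be
written as:

$$ \begin{aligned} \sum_{\text{spins}}|\bar u_f\,\Gamma\,u_i|^2 &= \sum_{\text{spins}}(\bar u_f\,\Gamma\,u_i)(u_i^\dagger\,\Gamma^\dagger\gamma^0\,u_f) \\ &= \sum_{\text{spins}}(\bar u_f\,\Gamma\,u_i)(\bar u_i\,\gamma^0\Gamma^\dagger\gamma^0\,u_f) \\ &= \sum_{\text{spins}}\bar u_{f,\alpha}\,\Gamma_{\alpha\beta}\,u_{i,\beta}\,\bar u_{i,\gamma}\,(\gamma^0\Gamma^\dagger\gamma^0)_{\gamma\delta}\,u_{f,\delta} \\ &= \text{Tr}\left(\Gamma\,\frac{\slashed{p}_i + m}{2m}\,\gamma^0\Gamma^\dagger\gamma^0\,\frac{\slashed{p}_f + m}{2m}\right) \end{aligned} \tag{5.54} $$

where we have used the fact that the sum over spins in Eq. (3.109) gives us:

$$ \sum_{\text{spins}}u_\alpha(p, s)\,\bar u_\beta(p, s) = \left(\frac{\slashed{p} + m}{2m}\right)_{\alpha\beta} \tag{5.55} $$

The last trace can be performed, since only the trace of even numbers of Dirac
matrices survives:

$$ \text{Tr}\,\gamma^0\slashed{p}_i\gamma^0\slashed{p}_f = 4\,(2E_iE_f - p_i\cdot p_f) \tag{5.56} $$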
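
The trace identity in Eq. (5.56) can be verified with explicit matrices. The sketch below assumes the Dirac representation of the gamma matrices (any representation gives the same trace) and uses sample kinematics chosen here for illustration; the helper names are not from the text.

```python
import math

# Explicit check of Tr(gamma^0 pslash_i gamma^0 pslash_f) = 4 (2 E_i E_f - p_i.p_f)
# in the Dirac representation. Kinematic values are illustrative assumptions.


def matmul(x, y):
    """Product of two square matrices stored as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(x):
    """Sum of the diagonal entries."""
    return sum(x[i][i] for i in range(len(x)))


PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

GAMMA0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def gamma_spatial(sigma):
    """gamma^i = [[0, sigma_i], [-sigma_i, 0]] in 2x2 block form."""
    g = [[0] * 4 for _ in range(4)]
    for a in range(2):
        for b in range(2):
            g[a][b + 2] = sigma[a][b]
            g[a + 2][b] = -sigma[a][b]
    return g


GAMMAS = [GAMMA0] + [gamma_spatial(s) for s in PAULI]


def slash(p):
    """pslash = gamma^0 E - gamma . p for a four-vector p = (E, px, py, pz)."""
    coeffs = [p[0], -p[1], -p[2], -p[3]]
    return [[sum(c * g[i][j] for c, g in zip(coeffs, GAMMAS)) for j in range(4)]
            for i in range(4)]


def minkowski_dot(a, b):
    """Four-vector product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


mass, momentum, theta = 0.511, 1.3, 0.7
energy = math.sqrt(momentum ** 2 + mass ** 2)
p_i = (energy, 0.0, 0.0, momentum)
p_f = (energy, momentum * math.sin(theta), 0.0, momentum * math.cos(theta))

lhs = trace(matmul(matmul(matmul(GAMMA0, slash(p_i)), GAMMA0), slash(p_f)))
rhs = 4.0 * (2.0 * p_i[0] * p_f[0] - minkowski_dot(p_i, p_f))
```

The two sides agree to rounding error, with a vanishing imaginary part on the left-hand side.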


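Finally, we need some kinematical information. If $\theta$ is the angle between $\mathbf p_i$ and
$\mathbf p_f$, then:

$$ \begin{aligned} p_i\cdot p_f &= m^2 + 2\beta^2E^2\sin^2(\theta/2) \\ |\mathbf q|^2 &= 4|\mathbf p|^2\sin^2(\theta/2) \end{aligned} \tag{5.57} $$

We then obtain the Mott cross section⁴:

$$ \frac{d\sigma}{d\Omega} = \frac{Z^2\alpha^2}{4|\mathbf p|^2\beta^2\sin^4(\theta/2)}\left(1 - \beta^2\sin^2\frac{\theta}{2}\right) \tag{5.58} $$

In the nonrelativistic limit, as $\beta \to 0$, we obtain the celebrated Rutherford
scattering formula.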
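
To see the nonrelativistic limit concretely, the sketch below evaluates Eq. (5.58) and compares it with the Rutherford form $Z^2\alpha^2/(4|\mathbf p|^2v^2\sin^4(\theta/2))$ with $v = |\mathbf p|/m$. Natural units are assumed, the function names are illustrative, and the sample values for the charge, mass, and momenta are choices made here, not values from the text.

```python
import math

# Mott cross section, Eq. (5.58), versus its nonrelativistic (Rutherford) limit.
# Natural units (hbar = c = 1); sample inputs are illustrative assumptions.

ALPHA = 1.0 / 137.036  # fine-structure constant


def mott(z, p, m, theta):
    """d sigma / d Omega from Eq. (5.58), with beta = p / E."""
    e = math.sqrt(p * p + m * m)
    beta2 = (p / e) ** 2
    s2 = math.sin(theta / 2.0) ** 2
    return (z * ALPHA) ** 2 * (1.0 - beta2 * s2) / (4.0 * p * p * beta2 * s2 * s2)


def rutherford(z, p, m, theta):
    """Rutherford formula with the nonrelativistic velocity v = p / m."""
    v2 = (p / m) ** 2
    s2 = math.sin(theta / 2.0) ** 2
    return (z * ALPHA) ** 2 / (4.0 * p * p * v2 * s2 * s2)


z_charge, mass, angle = 79, 0.511e-3, 1.0  # mass in GeV, angle in radians

slow_ratio = mott(z_charge, 1.0e-3 * mass, mass, angle) / rutherford(
    z_charge, 1.0e-3 * mass, mass, angle)
fast_ratio = mott(z_charge, 10.0 * mass, mass, angle) / rutherford(
    z_charge, 10.0 * mass, mass, angle)
```

For the slow electron the ratio is essentially one, while for the fast electron the two expressions differ substantially, both because $v = |\mathbf p|/m$ is no longer the true velocity and because of the spin-dependent factor $1 - \beta^2\sin^2(\theta/2)$.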
5.3 LSZ Reduction Formulas


Up to now, we have made the approximation that certain fields, such as the
electromagnetic field, were classical. The development of scattering theory in this
approximation was rather intuitive.

In this chapter, we would like to introduce a more convenient method, the
LSZ reduction formalism,⁵ from which we can derive scattering amplitudes to all
orders in perturbation theory. The LSZ method gives us a simple derivation of
Feynman’s rules, which were originally derived from a more intuitive approach
using propagator theory.

The LSZ approach begins with the physical $S$ matrix, making as few assumptions
as possible. We start by defining the “in” and “out” states, which are free
particle states at asymptotic times; that is, $t = -\infty$ and $\infty$, respectively. We
choose to distinguish them from the intermediate states, which are defined off-shell
and are interacting. Our goal is to express the interacting $S$ matrix, defined
in terms of the unknown interacting field $\phi(x)$, in terms of these free asymptotic
states. (We caution that in certain theories, such as QCD, the asymptotic states
are bound and do not correspond to free states.)

The $S$ matrix is defined as the matrix element of the transition from one
asymptotic set of states to another. Let $f$ denote a collection of free asymptotic
states at $t = \infty$, while $i$ refers to another collection of asymptotic states at $t = -\infty$.
Then the $S$ matrix describes the scattering of the $i$ states into the $f$ states:

$$ S_{fi} = {}_{\text{out}}\langle f|i\rangle_{\text{in}} \tag{5.59} $$

We postulate the existence of an operator $S$ that converts asymptotic states at
$t = \infty$ to states at $t = -\infty$:

$$ \begin{aligned} |f\rangle_{\text{in}} &= S\,|f\rangle_{\text{out}} \\ S_{fi} &= {}_{\text{out}}\langle f|S|i\rangle_{\text{out}} = {}_{\text{in}}\langle f|S|i\rangle_{\text{in}} \end{aligned} \tag{5.60} $$

For these asymptotic states, we also have asymptotic fields $\phi_{\text{in}}$ and $\phi_{\text{out}}$ that are
free fields. Thus, we can use the machinery developed in Chapter 4 to describe
these asymptotic free states. In particular, we can use Eqs. (3.15) and (3.16) to
define the state vector as the vacuum state multiplied by creation operators:

$$ |q\rangle_{\text{in}} = a^\dagger_{\text{in}}(q)\,|0\rangle_{\text{in}} = -i\int d^3x\;e_q(x)\,\overleftrightarrow{\partial_0}\,\phi_{\text{in}}(x)\,|0\rangle_{\text{in}} \tag{5.61} $$
The goal of the LSZ method is to reduce all expressions involving the full,
interacting field $\phi(x)$ (which has matrix elements that cannot be easily computed)
into simpler expressions involving the free, asymptotic fields $\phi_{\text{in}}$ and $\phi_{\text{out}}$. The
$S$ matrix, written as a transition matrix, is useless to us at this moment. The
goal of the LSZ approach, therefore, is to continue this process until we have
gradually extracted out of the $S$ matrix the entire set of fields contained within
the asymptotic states. Then we can use the machinery developed in the previous
chapter to manipulate and reduce these fields.

There is, however, a subtle point we should mention. Naively, one might
expect that the interacting field $\phi(x)$, taken at infinitely negative or positive times,
should smoothly approach the value of the free asymptotic fields, so that:

$$ x_0 \to -\infty: \qquad \phi(x) \to Z^{1/2}\,\phi_{\text{in}}(x) \tag{5.62} $$

where the factor $Z^{1/2}$ arises because of renormalization effects (which will be
eliminated in Chapter 7). However, this naive assumption is actually incorrect.
If we take this “strong” assumption, then it can be shown that the $S$ matrix
becomes trivial and no scattering takes place. We must therefore take the “weak”
assumption, that the matrix elements of the two fields $\phi(x)$ and $\phi_{\text{in}}$ approach each
other at infinitely negative times; that is:

$$ x_0 \to -\infty: \qquad \langle f|\phi(x)|i\rangle \to Z^{1/2}\,\langle f|\phi_{\text{in}}(x)|i\rangle \tag{5.63} $$

For the moment, however, we will simply ignore the complications that arise due
to this. We will return to the question of evaluating $Z$ later.

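Let us now take an arbitrary $S$ matrix element for the scattering of $m$ particles
with momenta $q_i$ into $n$ particles with momenta $p_i$. We first extract the field
$\phi_{\text{in}}(q_1)$ from the asymptotic “in” state:

$$ \begin{aligned} {}_{\text{out}}\langle p_1, p_2, \cdots, p_n|q_1, q_2, \cdots, q_m\rangle_{\text{in}} &= {}_{\text{out}}\langle p_1, p_2, \cdots, p_n|a^\dagger_{\text{in}}(q_1)|q_2, q_3, \cdots, q_m\rangle_{\text{in}} \\ &= -i\lim_{t\to-\infty}\int d^3x\;e_{q_1}(x)\,\overleftrightarrow{\partial_0}\;{}_{\text{out}}\langle p_1, p_2, \cdots, p_n|\phi_{\text{in}}(x)|q_2, q_3, \cdots, q_m\rangle_{\text{in}} \end{aligned} \tag{5.64} $$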

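Next, we wish to convert the three-dimensional integral $\int d^3x$ into a four-dimensional
one. We use the identity:

$$ \left(\lim_{t\to\infty} - \lim_{t\to-\infty}\right)\int d^3x\,A(\mathbf x, t) = \int_{-\infty}^{\infty} dt\,\frac{\partial}{\partial t}\int d^3x\,A(\mathbf x, t) \tag{5.65} $$

Since we already have a term at $t \to -\infty$, we add and subtract the same term as
$t \to \infty$. This therefore gives us an integral over four-dimensional space-time,
plus a term at $t \to \infty$:

$$ \begin{aligned} {}_{\text{out}}\langle p_1, p_2, \cdots, p_n|q_1, \cdots, q_m\rangle_{\text{in}} &= iZ^{-1/2}\int d^4x\;\partial_0\left[e_{q_1}(x)\,\overleftrightarrow{\partial_0}\;{}_{\text{out}}\langle p_1, p_2, \cdots|\phi(x)|q_2, \cdots\rangle_{\text{in}}\right] \\ &\quad - iZ^{-1/2}\lim_{t\to\infty}\int d^3x\left[e_{q_1}(x)\,\overleftrightarrow{\partial_0}\;{}_{\text{out}}\langle p_1, p_2, \cdots|\phi(x)|q_2, \cdots\rangle_{\text{in}}\right] \end{aligned} \tag{5.66} $$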


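This last term, in turn, can be written as the creation operator of an “out” state:

$$ -iZ^{-1/2}\lim_{t\to\infty}\int d^3x\left[e_{q_1}(x)\,\overleftrightarrow{\partial_0}\;{}_{\text{out}}\langle p_1, p_2, \cdots|\phi(x)|q_2, \cdots\rangle_{\text{in}}\right] = {}_{\text{out}}\langle p_1, p_2, \cdots|a^\dagger_{\text{out}}(q_1)|q_2, \cdots\rangle_{\text{in}} \tag{5.67} $$

The price of converting a three-dimensional integral to a four-dimensional one
is that we have now generated a new term at $t \to \infty$, which is the matrix element of
an “out” operator $a^\dagger_{\text{out}}(q_1)$. Because this “out” operator is an annihilation operator
if it acts to the left, in general it gives us zero. The only exception is if there
is, within the collection of “out” states, precisely the same state with momentum
$q_1$. Thus, the matrix element vanishes unless, for example, the $i$th state with
momentum $p_i$ has exactly the same momentum as $q_1$:

$$ {}_{\text{out}}\langle p_1, p_2, \cdots|a^\dagger_{\text{out}}(q_1)|q_2, q_3, \cdots\rangle_{\text{in}} = \sum_{i=1}^{n} 2p_i^0\,(2\pi)^3\,\delta^3(\mathbf p_i - \mathbf q_1)\;{}_{\text{out}}\langle p_1, \cdots, \hat p_i, \cdots, p_n|q_2, \cdots, q_m\rangle_{\text{in}} \tag{5.68} $$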


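where the caret over a variable means that we delete that particular variable. This
term is called a “disconnected graph,” because one particle emerges unaffected by
the scattering process, and is hence disconnected from the rest of the particles.

Now, we come to a key step in the calculation. So far, the expressions are
noncovariant because of the presence of two time derivatives. We will now
convert the time derivatives in the reduced matrix element into a fully covariant
object, thus restoring Lorentz invariance. This is possible because the operator
$\partial_0^2 - \partial_i^2 + m^2$ annihilates the plane wave $\exp(-iq_1\cdot x)$. By integrating by parts,
we will be able to convert the various time derivatives into a fully covariant $\partial_\mu^2$:

$$ \begin{aligned} \int d^4x\;\partial_0\left[e_{q_1}(x)\,\overleftrightarrow{\partial_0}\;{}_{\text{out}}\langle p_1, \cdots|\phi(x)|q_2, \cdots\rangle_{\text{in}}\right] &= -\int d^4x\left[\partial_0^2\,e_{q_1}(x)\right]{}_{\text{out}}\langle p_1, \cdots|\phi(x)|q_2, \cdots\rangle_{\text{in}} \\ &\quad + \int d^4x\;e_{q_1}(x)\,\partial_0^2\;{}_{\text{out}}\langle p_1, \cdots|\phi(x)|q_2, \cdots\rangle_{\text{in}} \\ &= \int d^4x\left[(-\partial_i^2 + m^2)\,e_{q_1}(x)\right]{}_{\text{out}}\langle p_1, \cdots|\phi(x)|q_2, \cdots\rangle_{\text{in}} \\ &\quad + \int d^4x\;e_{q_1}(x)\,\partial_0^2\;{}_{\text{out}}\langle p_1, \cdots|\phi(x)|q_2, \cdots\rangle_{\text{in}} \\ &= \int d^4x\;e_{q_1}(x)\,(\partial_\mu^2 + m^2)\;{}_{\text{out}}\langle p_1, \cdots|\phi(x)|q_2, \cdots\rangle_{\text{in}} \end{aligned} \tag{5.69} $$

Collecting our results, we find:

$$ \begin{aligned} {}_{\text{out}}\langle p_1, p_2, \cdots, p_n|q_1, q_2, \cdots, q_m\rangle_{\text{in}} &= \text{disconnected graph} \\ &\quad + iZ^{-1/2}\int d^4x\;e_{q_1}(x)\,(\partial_\mu^2 + m^2)\;{}_{\text{out}}\langle p_1, p_2, \cdots|\phi(x)|q_2, \cdots\rangle_{\text{in}} \end{aligned} \tag{5.70} $$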


We have now completed the first step of the LSZ program. By extracting the
state with momentum $q_1$ from the “in” states, we now have reduced an abstract
$S$ matrix element into the matrix element of a field $\phi$. Our goal, obviously,
is to continue this process until all the asymptotic fields are extracted from the
asymptotic states.

A small complication emerges when we extract out a second field from the
“out” state with momentum $p_1$. We find that we must adopt the time-ordered
product of two fields (otherwise, we cannot make the transition from time derivatives
to the Lorentz covariant derivatives). Repeating the identical steps as before,
and including this important feature of time ordering, we find:

$$ \begin{aligned} {}_{\text{out}}\langle p_1, \cdots|\phi(x_1)|q_2, \cdots\rangle_{\text{in}} &= {}_{\text{out}}\langle p_2, \cdots|\phi(x_1)\,a_{\text{in}}(p_1)|q_2, \cdots\rangle_{\text{in}} \\ &\quad + iZ^{-1/2}\int d^4y_1\;e^*_{p_1}(y_1)\,(\partial_\mu^2 + m^2)_{y_1}\;{}_{\text{out}}\langle p_2, \cdots|T\phi(y_1)\phi(x_1)|q_2, \cdots\rangle_{\text{in}} \end{aligned} \tag{5.71} $$


As before, the $a_{\text{in}}$ operator now acts to the right, where it is an annihilation operator
that destroys a state with exactly momentum $p_1$, generating a disconnected graph.
Completing all steps, we now have the twice-reduced matrix element:

$$ \begin{aligned} {}_{\text{out}}\langle p_1, \cdots|q_1, \cdots\rangle_{\text{in}} &= \text{disconnected graphs} \\ &\quad + (iZ^{-1/2})^2\int d^4x_1\,d^4y_1\;e^*_{p_1}(y_1)\,e_{q_1}(x_1)\,(\partial_\mu^2 + m^2)_{y_1}(\partial_\mu^2 + m^2)_{x_1} \\ &\qquad \times {}_{\text{out}}\langle p_2, \cdots|T\phi(y_1)\phi(x_1)|q_2, \cdots\rangle_{\text{in}} \end{aligned} \tag{5.72} $$


It is now straightforward to apply this reduction process to all fields contained 
within the asymptotic state vectors. After each reduction, the only complication is 
that we generate disconnected graphs and we must be careful with time ordering. 
The final result is: 


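$$ \begin{aligned} {}_{\text{out}}\langle p_1, p_2, \cdots, p_n|q_1, q_2, \cdots, q_m\rangle_{\text{in}} &= \text{disconnected graphs} \\ &\quad + (iZ^{-1/2})^{n+m}\int d^4y_1\cdots d^4x_m\;\prod_{i=1}^{n} e^*_{p_i}(y_i)\prod_{j=1}^{m} e_{q_j}(x_j) \\ &\qquad \times (\partial_\mu^2 + m^2)_{y_1}\cdots(\partial_\mu^2 + m^2)_{x_m}\,\langle 0|T\phi(y_1)\cdots\phi(x_m)|0\rangle \end{aligned} \tag{5.73} $$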


(In general, we can choose momenta so that the disconnected graphs are zero. 
For values of the momenta where the disconnected graphs are non-zero, we can 
use this formalism to reduce them out as well.) 


5.4 Reduction of Dirac Spinors 


It is now straightforward to apply this formalism to the reduction of fermionic
$S$ matrices. Following the steps of the bosonic case, we write the creation and
annihilation operators in terms of the original Dirac field. Then we write the
asymptotic condition as:

$$ x_0 \to -\infty: \qquad \langle f|\psi(x)|i\rangle \to Z_2^{1/2}\,\langle f|\psi_{\text{in}}(x)|i\rangle \tag{5.74} $$


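Next, we write the $S$ matrix and reduce out one creation operator using Eq. (3.116):

$$ \begin{aligned} {}_{\text{out}}\langle f|k, i\rangle_{\text{in}} &= {}_{\text{out}}\langle f|b^\dagger_{\text{in}}(k, \epsilon)|i\rangle_{\text{in}} \\ &= Z_2^{-1/2}\lim_{t\to-\infty}\int d^3x\;{}_{\text{out}}\langle f|\bar\psi(x)|i\rangle_{\text{in}}\,\gamma^0\,U_k(x, \epsilon) \\ &= {}_{\text{out}}\langle f|b^\dagger_{\text{out}}(k, \epsilon)|i\rangle_{\text{in}} \\ &\quad - Z_2^{-1/2}\int d^4x\left[{}_{\text{out}}\langle f|\bar\psi(x)|i\rangle_{\text{in}}\,\overleftarrow{\partial_0}\,\gamma^0\,U_k(x, \epsilon) + {}_{\text{out}}\langle f|\bar\psi(x)|i\rangle_{\text{in}}\,\gamma^0\,\partial_0\,U_k(x, \epsilon)\right] \end{aligned} \tag{5.75} $$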


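As before, we must convert the $\gamma^0\partial_0$ into the covariant $\gamma^\mu\partial_\mu$. This is accomplished
because we can replace the time derivative with a space derivative (because the
spinor $u(k, \epsilon)e^{-ik\cdot x}$ satisfies the Dirac equation). In this way, we can now write
down four different types of reduction formulas, depending on which creation or
annihilation process we are analyzing.

The reduction formulas after a single reduction now read as follows:

$$ \begin{aligned} {}_{\text{out}}\langle f|b^\dagger_{\text{in}}(k, \epsilon)|i\rangle_{\text{in}} &= -iZ_2^{-1/2}\int d^4x\;{}_{\text{out}}\langle f|\bar\psi(x)|i\rangle_{\text{in}}\,(-i\overleftarrow{\slashed{\partial}} - m)\,U_k(x, \epsilon) \\ {}_{\text{out}}\langle f|d^\dagger_{\text{in}}(k, \epsilon)|i\rangle_{\text{in}} &= +iZ_2^{-1/2}\int d^4x\;\bar V_k(x, \epsilon)\,(i\slashed{\partial} - m)\;{}_{\text{out}}\langle f|\psi(x)|i\rangle_{\text{in}} \\ {}_{\text{out}}\langle f|b_{\text{out}}(k, \epsilon)|i\rangle_{\text{in}} &= -iZ_2^{-1/2}\int d^4x\;\bar U_k(x, \epsilon)\,(i\slashed{\partial} - m)\;{}_{\text{out}}\langle f|\psi(x)|i\rangle_{\text{in}} \\ {}_{\text{out}}\langle f|d_{\text{out}}(k, \epsilon)|i\rangle_{\text{in}} &= +iZ_2^{-1/2}\int d^4x\;{}_{\text{out}}\langle f|\bar\psi(x)|i\rangle_{\text{in}}\,(-i\overleftarrow{\slashed{\partial}} - m)\,V_k(x, \epsilon) \end{aligned} \tag{5.76} $$

where we have dropped the disconnected graph.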

Making successive reductions, until all creation and annihilation operators 
are reduced out, is also straightforward. As before, we find that we must take 
time-ordered matrix elements of the various fields. Let us take the matrix element 
between incoming particles and outgoing particles. The incoming particles are
labeled by $p_1, p_2, \ldots$, while the incoming antiparticles are labeled by $p_1', p_2', \ldots$.
The outgoing particles are labeled by $q_1, q_2, \ldots$, while the outgoing antiparticles
are labeled by $q_1', q_2', \ldots$.
The matrix element, after reduction, yields the following:


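$$
\begin{aligned}
&{}_{\text{out}}\langle 0|b_{\text{out}}(q_1)\cdots d_{\text{out}}(q_1')\cdots b^\dagger_{\text{in}}(p_1)\cdots d^\dagger_{\text{in}}(p_1')\cdots|0\rangle_{\text{in}} \\
&\quad = (-iZ_2^{-1/2})^{n}\,(iZ_2^{-1/2})^{n'}\int d^4x_1\cdots d^4x_1'\cdots d^4y_1\cdots d^4y_1'\cdots \\
&\qquad \times \bar U_{q_1}(y_1)\,(i\not\!\partial - m)_{y_1}\cdots \bar V_{p_1'}(x_1')\,(i\not\!\partial - m)_{x_1'}\cdots \\
&\qquad \times \langle 0|T\left[\bar\psi(y_1')\cdots\psi(y_1)\cdots\bar\psi(x_1)\cdots\psi(x_1')\cdots\right]|0\rangle \\
&\qquad \times (-i\overleftarrow{\not\!\partial} - m)_{x_1}\,U_{p_1}(x_1)\cdots(-i\overleftarrow{\not\!\partial} - m)_{y_1'}\,V_{q_1'}(y_1')\cdots
\end{aligned}
\tag{5.77}
$$

where $n$ and $n'$ denote the total numbers of particles and antiparticles, and
where we have discarded the disconnected graphs. In this way, we can reduce out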
even the most complex scattering processes in terms of the reduction formulas. 


5.5 Time Evolution Operator 


The LSZ reduction formulas have been able to convert the abstract S matrix 
element into the product of vacuum expectation values of the fully interacting 
fields. No approximations have been made. However, we still do not know how 
to take matrix elements of the interacting fields. Hence, we cannot yet extract
numbers from these matrix elements. The problem is that everything is written
in terms of the fully interacting fields, of which we know almost nothing. The 
key is now to make an approximation to the theory by power expanding in the 
coupling constant, which is of the order of 1/137 for QED. We begin by splitting 
the Hamiltonian into two distinct pieces: 


$$
H = H_0 + H_I \tag{5.78}
$$

where $H_0$ is the free Hamiltonian and $H_I$ is the interacting part. For the $\phi^4$ theory,
for example, the interacting part would be:

$$
H_I = \int d^3x\, \mathcal{H}_I; \qquad \mathcal{H}_I = \frac{\lambda}{4!}\phi^4 \tag{5.79}
$$


At this point, it is useful to remind ourselves from ordinary quantum mechanics 
that there are several “pictures” in which to describe this time evolution. In the 
Schrödinger picture, we recall, the wave function $\psi(x, t)$ and state vector are
functions of time t, but the operators of the theory are constants in time. In 
the Heisenberg picture, the reverse is true; that is, the wave function and state 
vectors are constants in time, but the time evolution of the operators and dynamical 
variables of the theory are governed by the Hamiltonian: 


$$
\phi(\mathbf{x}, t) = e^{iHt}\,\phi(\mathbf{x}, 0)\,e^{-iHt} \tag{5.80}
$$


In the LSZ formalism, we will find it convenient to define yet another picture, 
which resembles the interaction picture. In this new picture, we need to find a 
unitary operator $U(t)$ that takes us from the fully interacting field $\phi(x)$ to the free,
asymptotic “in” states:

$$
\phi(t, \mathbf{x}) = U^{-1}(t)\,\phi_{\text{in}}(t, \mathbf{x})\,U(t) \tag{5.81}
$$

where $U(t) = U(t, -\infty)$ is a time evolution operator, which obeys:

$$
\begin{aligned}
U(t_1, t_2)\,U(t_2, t_3) &= U(t_1, t_3) \\
U^{-1}(t_1, t_2) &= U(t_2, t_1) \\
U(t, t) &= 1
\end{aligned}
\tag{5.82}
$$


Because we now have two totally different types of scalar fields, one free 
and the other interacting, we must also be careful to distinguish the Hamiltonian 
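written in terms of the free or the interacting fields. Let $H(t)$ be the fully interacting
Hamiltonian written in terms of the interacting field, and let $H_0^{\text{in}} = H_0(\phi_{\text{in}})$ represent the
free Hamiltonian written in terms of the free asymptotic states. Then the free field
$\phi_{\text{in}}$ and the interacting field satisfy two different equations of motion:

$$
\begin{aligned}
\frac{\partial}{\partial t}\phi(t, \mathbf{x}) &= i\left[H(t), \phi(t, \mathbf{x})\right] \\
\frac{\partial}{\partial t}\phi_{\text{in}}(t, \mathbf{x}) &= i\left[H_0^{\text{in}}, \phi_{\text{in}}(t, \mathbf{x})\right]
\end{aligned}
\tag{5.83}
$$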


To solve for $U(t)$, we need to extract a few more identities. If we differentiate the
expression $UU^{-1} = 1$, we find:

$$
\left[\frac{d}{dt}U(t)\right]U^{-1}(t) + U(t)\,\frac{d}{dt}U^{-1}(t) = 0 \tag{5.84}
$$


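Now let us take the derivative of $\phi_{\text{in}}$ and use the identities that we have written
down:

$$
\begin{aligned}
\frac{\partial}{\partial t}\phi_{\text{in}}(t, \mathbf{x}) &= \frac{\partial}{\partial t}\left[U(t)\,\phi(t, \mathbf{x})\,U^{-1}(t)\right] \\
&= \dot U\,\phi\,U^{-1} + U\,\dot\phi\,U^{-1} + U\,\phi\,\frac{dU^{-1}}{dt} \\
&= \dot U\left(U^{-1}\phi_{\text{in}}U\right)U^{-1} + U\left(i\left[H(\phi, \pi), \phi\right]\right)U^{-1} + UU^{-1}\phi_{\text{in}}\,U\frac{dU^{-1}}{dt} \\
&= \dot UU^{-1}\phi_{\text{in}} + iU\left[H(\phi, \pi), \phi\right]U^{-1} + \phi_{\text{in}}\,U\frac{dU^{-1}}{dt} \\
&= \left[\dot UU^{-1} + iH(\phi_{\text{in}}, \pi_{\text{in}}),\ \phi_{\text{in}}\right]
\end{aligned}
\tag{5.85}
$$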


The last expression, in turn, must equal $i[H_0^{\text{in}}, \phi_{\text{in}}]$. This means that the following
expression commutes with every “in” operator, and hence must be a c number:

$$
\dot UU^{-1} + i\left[H(\phi_{\text{in}}, \pi_{\text{in}}) - H_0^{\text{in}}\right] = \text{c number} \tag{5.86}
$$

(This c number, one can show, does not contribute to the S matrix.) Thus, the
$U(t)$ operator satisfies the following:

$$
i\frac{\partial}{\partial t}U(t, t_0) = H_I(t)\,U(t, t_0) \tag{5.87}
$$

where:

$$
H_I(t) = H(\phi_{\text{in}}, \pi_{\text{in}}) - H_0^{\text{in}} \tag{5.88}
$$


that is, $H_I(t)$ is defined to be the interaction Hamiltonian defined only with free,
asymptotic fields. Since $H_I(t)$ does not necessarily commute with $H_I(t')$ at
different times, the integration of the previous equation is a bit delicate. However, 
one can show that: 


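$$
\begin{aligned}
U(t) = U(t, -\infty) &= T\exp\left(-i\int_{-\infty}^{t} dt_1\, H_I(t_1)\right) \\
&= T\exp\left(-i\int_{-\infty}^{t} dt_1 \int d^3x_1\, \mathcal{H}_I(\mathbf{x}_1, t_1)\right)
\end{aligned}
\tag{5.89}
$$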


where the operator T means that, as we integrate over $t_1$, we place the exponentials
sequentially in time order. To prove this expression, we simply insert it into Eq. 
(5.87). Written in this form, however, this expression is not very useful. We will 
find it much more convenient to power expand the exponential in a Taylor series, 
so we have: 


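$$
U(t) = 1 + \sum_{n=1}^{\infty}\frac{(-i)^n}{n!}\int_{-\infty}^{t} d^4x_1 \int_{-\infty}^{t} d^4x_2 \cdots \int_{-\infty}^{t} d^4x_n\; T\left[\mathcal{H}_I(x_1)\,\mathcal{H}_I(x_2)\cdots\mathcal{H}_I(x_n)\right] \tag{5.90}
$$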
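As a numerical sanity check of Eqs. (5.87) to (5.90), the sketch below compares a time-ordered exponential, built as a product of short time steps with later times on the left, against the power expansion truncated at first and second order. Everything in it is an illustrative assumption rather than part of the text: the 2x2 toy Hamiltonian `h_int`, the finite time interval, the coupling `g`, and all function names. The toy $H_I(t)$ is chosen only because it does not commute with itself at different times.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def axpy(a, b, s):
    """Return a + s*b for 2x2 matrices."""
    return [[a[i][j] + s * b[i][j] for j in range(2)] for i in range(2)]

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def h_int(t, g):
    """Toy H_I(t) = g (cos t sigma_x + sin t sigma_z); note H_I(t)^2 = g^2."""
    return [[g * math.sin(t), g * math.cos(t)],
            [g * math.cos(t), -g * math.sin(t)]]

def step(t, dt, g):
    """exp(-i H_I(t) dt), exact because H_I(t)^2 is g^2 times the identity."""
    c, s = math.cos(g * dt), math.sin(g * dt)
    return axpy([[c, 0.0], [0.0, c]], h_int(t, g), -1j * s / g)

def ordered_exponential(t0, t1, g, n):
    """T exp(-i int H_I dt) as a product of n short steps, later times on the left."""
    dt = (t1 - t0) / n
    u = IDENTITY
    for k in range(n):
        u = matmul(step(t0 + (k + 0.5) * dt, dt, g), u)
    return u

def dyson(t0, t1, g, n, order):
    """Power expansion truncated at first or second order on a midpoint grid."""
    dt = (t1 - t0) / n
    u = IDENTITY
    running = [[0.0, 0.0], [0.0, 0.0]]        # integral of H_I from t0 up to the current cell
    for k in range(n):
        h = h_int(t0 + (k + 0.5) * dt, g)
        u = axpy(u, h, -1j * dt)              # first-order term
        if order >= 2:
            # later time on the left; the diagonal cell contributes half its weight
            inner = axpy(running, h, 0.5 * dt)
            u = axpy(u, matmul(h, inner), -dt)   # (-i)^2 = -1
        running = axpy(running, h, dt)
    return u

def max_diff(a, b):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

U = ordered_exponential(0.0, 1.0, 0.1, 400)
print(max_diff(U, dyson(0.0, 1.0, 0.1, 400, 1)))   # error of order g^2
print(max_diff(U, dyson(0.0, 1.0, 0.1, 400, 2)))   # error of order g^3
```

Keeping the second-order term shrinks the discrepancy by roughly a factor of the coupling, which is the behavior one expects from a power expansion in a small parameter.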


(See Exercise 10 concerning the change in the upper limits of integration.) Now 
that we have an explicit solution for U(t), let us decompose the interacting Green’s 
function. If we take the matrix element of a series of interacting ¢ fields, we will 
use the $U(t)$ operator to convert the entire expression to free, asymptotic fields.
To do this, let us first choose a sequence of space-time points $x_i^\mu$, time ordered
such that $x_1^0 > x_2^0 > \cdots > x_n^0$, so we can drop the time ordering operator T.

Then we can replace all interacting fields with free fields by making the
conversion $\phi = U^{-1}\phi_{\text{in}}U$ everywhere:


$$
\begin{aligned}
\langle 0|T\left[\phi(x_1)\phi(x_2)\cdots\phi(x_n)\right]|0\rangle &= \langle 0|\phi(x_1)\cdots\phi(x_n)|0\rangle \\
&= \langle 0|U^{-1}(t_1)\phi_{\text{in}}(x_1)U(t_1)\,U^{-1}(t_2)\phi_{\text{in}}(x_2)U(t_2) \\
&\qquad \cdots U(t_{n-1})\,U^{-1}(t_n)\phi_{\text{in}}(x_n)U(t_n)|0\rangle \\
&= \langle 0|U^{-1}(t_1)\phi_{\text{in}}(x_1)U(t_1, t_2)\phi_{\text{in}}(x_2)\cdots U(t_{n-1}, t_n)\phi_{\text{in}}(x_n)U(t_n)|0\rangle \\
&= \langle 0|U^{-1}(t)\,U(t, t_1)\phi_{\text{in}}(x_1)\cdots\phi_{\text{in}}(x_n)U(t_n, -t)\,U(-t)|0\rangle \\
&= \langle 0|U^{-1}(t)\,T\left[\phi_{\text{in}}(x_1)\cdots\phi_{\text{in}}(x_n)\,U(t, -t)\right]U(-t)|0\rangle \\
&= \langle 0|U^{-1}(t)\,T\left[\phi_{\text{in}}(x_1)\cdots\phi_{\text{in}}(x_n)\exp\left(-i\int_{-t}^{t} dt'\, H_I(t')\right)\right]U(-t)|0\rangle
\end{aligned}
\tag{5.91}
$$

where $t$ is an arbitrarily long time, much greater than $t_1$, and $-t$ is much less than
$t_n$. We will later set $t = \infty$. In the last line, since we have restored the time
ordering operator T, we are allowed to move all U's around inside the matrix
element. We can thus combine them all into one large $U(t, -t)$, which in turn can
be expanded as a function of the interacting Lagrangian (defined strictly in terms
of free, asymptotic fields).

There is now one last step that must be performed. We still have the term
$U(-t)|0\rangle$ and $\langle 0|U^{-1}(t)$ to eliminate in the limit as $t \to \infty$. In general, since we
assume that the vacuum is stable for this theory, we know that the vacuum is an
eigenstate of the U operator, up to some phase.

Since $U(t) = U(t, -\infty)$, we can set:

$$
\lim_{t\to\infty} U(-t)|0\rangle = |0\rangle \tag{5.92}
$$
where we have taken the limit so that $U(-\infty, -\infty) = 1$. However, for the
other state $\langle 0|U^{-1}(t)$, we must be a bit more careful, since the limit gives us
$U^{-1}(\infty) = U(-\infty, +\infty)$. Since the vacuum is stable, this means that the vacuum
at $t = -\infty$ remains the vacuum at $t = +\infty$, modulo a possible phase $\lambda$. We thus
find:

$$
\lim_{t\to\infty}\langle 0|U^{-1}(t) = \lambda\langle 0| \tag{5.93}
$$
By hitting the equation with $|0\rangle$, the phase $\lambda$ is equal to:

$$
\begin{aligned}
\lambda &= \lim_{t\to\infty}\langle 0|U^{-1}(t)|0\rangle \\
&= \left[\lim_{t\to\infty}\langle 0|U(t)|0\rangle\right]^{-1} \\
&= \langle 0|T\exp\left(i\int d^4x\, \mathcal{L}_I(\phi_{\text{in}})\right)|0\rangle^{-1}
\end{aligned}
\tag{5.94}
$$

The phase $\lambda$ thus gives us the contribution of vacuum-to-vacuum graphs (i.e.,
graphs without any external legs). Putting everything together, we now have an
expression for the Green’s function defined with interacting fields:


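$$
G(x_1, x_2, \cdots, x_n) = \langle 0|T\left[\phi(x_1)\phi(x_2)\cdots\phi(x_n)\right]|0\rangle \tag{5.95}
$$

This interacting Green’s function, written in terms of free fields, becomes:

$$
G(x_1, x_2, \cdots, x_n) = \frac{\langle 0|T\left[\phi_{\text{in}}(x_1)\cdots\phi_{\text{in}}(x_n)\exp\left(i\int d^4x\, \mathcal{L}_I[\phi_{\text{in}}(x)]\right)\right]|0\rangle}{\langle 0|T\exp\left(i\int d^4x\, \mathcal{L}_I[\phi_{\text{in}}(x)]\right)|0\rangle} \tag{5.96}
$$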


Using the formalism that we have constructed, we can rewrite the previous 
matrix element entirely in terms of the asymptotic “in” fields by a power expansion 
of the exponential: 


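$$
\begin{aligned}
G(x_1, x_2, \cdots, x_n) = \sum_{m=0}^{\infty}\frac{(-i)^m}{m!}\int_{-\infty}^{\infty} d^4y_1\cdots d^4y_m\, \langle 0|T\big[&\phi_{\text{in}}(x_1)\phi_{\text{in}}(x_2)\cdots\phi_{\text{in}}(x_n) \\
&\times \mathcal{H}_I(y_1)\mathcal{H}_I(y_2)\cdots\mathcal{H}_I(y_m)\big]|0\rangle_c
\end{aligned}
\tag{5.97}
$$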


where the subscript c refers to connected diagrams only. (From now on, we will 
drop the “in” subscript on all fields. However, we must remind the reader that we 
have made the transition from the Heisenberg picture to this new picture where 
all fields are free.) 

The next step is to actually evaluate the time ordered product of an arbitrary 
number of free fields. To do this, we appeal to Wick’s theorem. 


5.6 Wick’s Theorem 


We begin our discussion by defining the “normal ordering” of operators. In 
general, unlike the classical situation, the product of two quantum fields taken at 
the same point is singular: 


$$
\lim_{x\to y}\phi(x)\phi(y) = \infty \tag{5.98}
$$

(In fact, in Chapter 14, we will investigate precisely how divergent this expression
is.) Unfortunately, our action consists of fields multiplied at the same point; so
the transition to the quantum theory is actually slightly ambiguous. To render our
expressions rigorous, we recall the normal ordered product of two fields introduced
in Eq. (3.27). Let us decompose the scalar field into its creation operators [labeled
by a − sign] and annihilation operators [labeled by a + sign]: $\phi(x) = \phi^+(x) + \phi^-(x)$.
Then the normal ordered product of these fields simply rearranges the creation and 
annihilation parts such that the creation operators always appear on the left, and 
annihilation operators always appear on the right. The normal ordered product is 
defined as: 


$$
:\!\phi(x)\phi(y)\!: \;=\; \phi^+(x)\phi^+(y) + \phi^-(x)\phi^+(y) + \phi^-(y)\phi^+(x) + \phi^-(x)\phi^-(y) \tag{5.99}
$$


(The normal ordered product of two fields is no longer a local object, since we 
have split up the components within the fields and reshuffled them.) 

One consequence of normal ordering is that the vacuum expectation value of 
any normal ordered product vanishes, since annihilation operators always appear 
on the right. It vanishes because: 


$$
\phi^+|0\rangle = \langle 0|\phi^- = 0 \tag{5.100}
$$


Similarly, one can define normal ordered products for more complicated products 
of fields. Now we can proceed to find the relationship between normal ordered 
products and time ordered products. It is easy to prove the following identity, or 
Wick’s theorem for two fields: 


$$
T\left[\phi(x_1)\phi(x_2)\right] = \;:\!\phi(x_1)\phi(x_2)\!: + \text{c number} \tag{5.101}
$$


The only difference between the time ordered and normal ordered products is 
that we have reshuffled the various annihilation and creation parts of the fields. 
Each time we commute parts of the fields past each other, we pick up c-number 
expressions. To find what this c-number expression is, we now simply take the 
vacuum expectation value of both sides. Since the vacuum expectation value of 
normal ordered products is zero, we find: 


$$
T\left[\phi(x_1)\phi(x_2)\right] = \;:\!\phi(x_1)\phi(x_2)\!: + \langle 0|T\left[\phi(x_1)\phi(x_2)\right]|0\rangle \tag{5.102}
$$

If we have three fields, then Wick’s theorem® reads: 


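$$
\begin{aligned}
T\left[\phi(x_1)\phi(x_2)\phi(x_3)\right] = \;&:\!\phi(x_1)\phi(x_2)\phi(x_3)\!: \\
&+ \langle 0|T\left[\phi(x_1)\phi(x_2)\right]|0\rangle\,\phi(x_3) \\
&+ \langle 0|T\left[\phi(x_2)\phi(x_3)\right]|0\rangle\,\phi(x_1) \\
&+ \langle 0|T\left[\phi(x_3)\phi(x_1)\right]|0\rangle\,\phi(x_2)
\end{aligned}
\tag{5.103}
$$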


The last three terms can be succinctly summarized by the sum $\sum_{\text{perm}}$, which
means summing over all permutations of $x_1, x_2, x_3$. To prove this expression with
three scalar fields, we take the original identity with just two scalar fields, and
multiply both sides of the equation on the right by $\phi(x_3)$, which has the earliest
time. Then we merge $\phi(x_3)$ into the normal ordered product. Merging $\phi^+(x_3)$
into the normal ordered product is easy, since the annihilation operators are on the
right, anyway. However, merging $\phi^-(x_3)$ into the normal ordered product is more
difficult, since $\phi^-(x_3)$ must move past $\phi^+(x_1)$ and $\phi^+(x_2)$. Each time it moves
past one of these terms, we pick up a c-number expression, which is equal to the
time ordered product; that is:

$$
\begin{aligned}
\langle 0|\phi^+(x_2)\phi^-(x_3)|0\rangle &= \langle 0|\phi(x_2)\phi(x_3)|0\rangle \\
&= \langle 0|T\left[\phi(x_2)\phi(x_3)\right]|0\rangle
\end{aligned}
\tag{5.104}
$$


In this way, we pick up all the terms in the Wick identity. Likewise, for four fields, 
we have: 


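$$
\begin{aligned}
T\left[\phi(x_1)\cdots\phi(x_4)\right] = \;&:\!\phi(x_1)\cdots\phi(x_4)\!: \\
&+ \sum_{\text{perm}}\langle 0|T\left[\phi(x_1)\phi(x_2)\right]|0\rangle :\!\phi(x_3)\phi(x_4)\!: \\
&+ \sum_{\text{perm}}\langle 0|T\left[\phi(x_1)\phi(x_2)\right]|0\rangle\,\langle 0|T\left[\phi(x_3)\phi(x_4)\right]|0\rangle
\end{aligned}
\tag{5.105}
$$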


By now, it should be obvious that the time ordered product of n fields can be 
written in terms of a sum of normal ordered products. For the general n-point case
(n even), Wick’s theorem reads: 


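$$
\begin{aligned}
T\left[\phi(x_1)\phi(x_2)\cdots\phi(x_n)\right] = \;&:\!\phi(x_1)\phi(x_2)\cdots\phi(x_n)\!: \\
&+ \sum_{\text{perm}}\langle 0|T\left[\phi(x_1)\phi(x_2)\right]|0\rangle :\!\phi(x_3)\cdots\phi(x_n)\!: \\
&+ \sum_{\text{perm}}\langle 0|T\left[\phi(x_1)\phi(x_2)\right]|0\rangle\,\langle 0|T\left[\phi(x_3)\phi(x_4)\right]|0\rangle :\!\phi(x_5)\cdots\phi(x_n)\!: \\
&+ \cdots \\
&+ \sum_{\text{perm}}\langle 0|T\left[\phi(x_1)\phi(x_2)\right]|0\rangle\cdots\langle 0|T\left[\phi(x_{n-1})\phi(x_n)\right]|0\rangle
\end{aligned}
\tag{5.106}
$$

For n odd, the last line reads:

$$
\sum_{\text{perm}}\langle 0|T\left[\phi(x_1)\phi(x_2)\right]|0\rangle\cdots\langle 0|T\left[\phi(x_{n-2})\phi(x_{n-1})\right]|0\rangle\,\phi(x_n) \tag{5.107}
$$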


These formulas are proved by induction on n. If we assume that they hold for
n − 1, then we multiply the entire formula on the right by $\phi(x_n)$, which has the
earliest time. Then, by merging $\phi(x_n)$ with the rest of the products, we find that the
formula now holds for n. The proof is straightforward, and runs the same as for
the case described earlier. Merging $\phi^+(x_n)$ poses no problem, since annihilation
operators are on the right anyway. However, each time $\phi^-(x_n)$ moves past $\phi^+(x_i)$,
we pick up a commutator, which is equal to the vacuum expectation of the time 
ordered product of the two fields. In this way, it is easy to show that we pick up 
the nth Wick’s theorem. 

If we take the vacuum expectation value of both sides of the equation, then 
Wick’s theorem for vacuum expectation values reads: 


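$$
\langle 0|T\left[\phi(x_1)\phi(x_2)\cdots\phi(x_n)\right]|0\rangle = \sum_{\text{perm}}\langle 0|T\left[\phi(x_1)\phi(x_2)\right]|0\rangle\cdots\langle 0|T\left[\phi(x_{n-1})\phi(x_n)\right]|0\rangle \tag{5.108}
$$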
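The sum over permutations in Eq. (5.108) runs over every distinct way of grouping the n points into pairs. The sketch below enumerates these pairings for a list of labels; the function name `pairings` and the string labels are illustrative assumptions, not notation from the text. For even n there are (n − 1)(n − 3)···1 pairings, and for odd n there are none, in agreement with the vanishing vacuum expectation value of an odd number of free fields.

```python
def pairings(points):
    """Yield every way of splitting `points` into unordered pairs."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


# Each pairing stands for one product of two-point functions in Eq. (5.108).
for pairing in pairings(["x1", "x2", "x3", "x4"]):
    print(" ".join(f"<0|T[phi({a})phi({b})]|0>" for a, b in pairing))
```

For four points this prints three terms, one for each way of pairing x1 with one of the other three points.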


The generalization to fermionic fields is also straightforward. [The only com- 
plication is that we pick up extra minus signs because of the anti-commutation 
properties of fermionic fields. For more complicated products, we must always 
insert (—1) whenever fermion fields move past each other.] We find: 


$$
T\left[\psi(x)\bar\psi(y)\right] = \;:\!\psi(x)\bar\psi(y)\!: + \langle 0|T\left[\psi(x)\bar\psi(y)\right]|0\rangle \tag{5.109}
$$


Now insert Eqs. (5.97) and (5.108) into Eq. (5.73). This gives us a complete 
reduction of the S matrix (written as a function of interacting fields) in terms of 
Green’s functions of free fields, as desired. In the last step, we can eliminate the 
$(\partial^2 + m^2)$ factors appearing in Eq. (5.73), because they act on two-point Green’s
functions and become delta functions:

$$
\begin{aligned}
(\partial^2 + m^2)_x\,\langle 0|T\phi(x)\phi(y)|0\rangle &= -i\delta^4(x - y) \\
(i\not\!\partial - m)\,\langle 0|T\psi(x)\bar\psi(y)|0\rangle &= i\delta^4(x - y)
\end{aligned}
\tag{5.110}
$$

In this fashion, we have now completely reduced the S matrix element into sums
of products of two-point Green’s functions and certain vertex elements. Although
Wick’s theorem seems a bit tedious, in actual practice the decomposition proceeds
rapidly. To see this, we will analyze the four-point function taken to first order in
$\lambda$ with an interaction given by $-\lambda\phi^4/4!$. We want to expand:


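$$
\begin{aligned}
G(x_1, \cdots, x_4) &= \frac{-i\lambda}{4!}\int d^4y\, \langle 0|T\left[\phi(x_1)\cdots\phi(x_4)\,\phi(y)^4\right]|0\rangle + \cdots \\
&= -i\lambda\int d^4y \prod_{i=1}^{4}\langle 0|T\left[\phi(x_i)\phi(y)\right]|0\rangle + \cdots \\
&= (-i\lambda)\int d^4y \prod_{i=1}^{4}\left[i\Delta_F(x_i - y)\right] + \cdots
\end{aligned}
\tag{5.111}
$$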
i=] 


where we have used Wick’s theorem. The 4! term has disappeared, because there 
are 4! ways in which four external legs at x; can be connected to the four fields 
contained within $\phi^4$.

Another example of this decomposition is given by the four-point function 
taken to second order: 


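$$
G(x_1, \cdots, x_4) = \frac{1}{2!}\left(\frac{-i\lambda}{4!}\right)^2\int d^4y_1\int d^4y_2\, \langle 0|T\left\{\phi(x_1)\cdots\phi(x_4)\left[\phi(y_1)\right]^4\left[\phi(y_2)\right]^4\right\}|0\rangle \tag{5.112}
$$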


The expansion, via Wick’s theorem, is straightforward: 


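$$
G(x_1, \cdots, x_4) = \frac{(-i\lambda)^2}{2!}\int d^4y_1\int d^4y_2\left\{\left[\Delta_F(y_1 - y_2)\right]^2 A_1 + \Delta_F(y_1 - y_1)\,\Delta_F(y_1 - y_2)\,A_2\right\} \tag{5.113}
$$

where:

$$
\begin{aligned}
A_1 = \;&\Delta_F(x_1 - y_1)\Delta_F(x_2 - y_1)\Delta_F(x_3 - y_2)\Delta_F(x_4 - y_2) \\
&+ \Delta_F(x_1 - y_1)\Delta_F(x_3 - y_1)\Delta_F(x_2 - y_2)\Delta_F(x_4 - y_2) \\
&+ \Delta_F(x_1 - y_1)\Delta_F(x_4 - y_1)\Delta_F(x_2 - y_2)\Delta_F(x_3 - y_2)
\end{aligned}
\tag{5.114}
$$

$$
\begin{aligned}
A_2 = \;&\Delta_F(x_1 - y_1)\Delta_F(x_2 - y_2)\Delta_F(x_3 - y_2)\Delta_F(x_4 - y_2) \\
&+ \Delta_F(x_1 - y_2)\Delta_F(x_2 - y_1)\Delta_F(x_3 - y_2)\Delta_F(x_4 - y_2) \\
&+ \Delta_F(x_1 - y_2)\Delta_F(x_2 - y_2)\Delta_F(x_3 - y_1)\Delta_F(x_4 - y_2) \\
&+ \Delta_F(x_1 - y_2)\Delta_F(x_2 - y_2)\Delta_F(x_3 - y_2)\Delta_F(x_4 - y_1)
\end{aligned}
\tag{5.115}
$$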


These are shown graphically in Figure 5.4.

Figure 5.4. The Feynman diagrams corresponding to the Wick decomposition of $\phi^4$ theory
to second order.


Finally, we will convert from x-space to p-space by using Eqs. (3.50) and 
(3.135). When we perform the x integrations, we obtain a delta function at 
each vertex, which represents the conservation of momentum. When all x-space 
integrations are performed, we are left with one delta function representing overall 
momentum conservation. We are also left with a momentum integration for each 
internal loop. Then all Feynman’s rules can be represented entirely in momentum 
space. 
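The counting in the paragraph above can be made concrete. As a minimal sketch, assume a connected diagram in $\phi^4$ theory with V vertices and E external legs (the function name and arguments are illustrative, not from the text). Each vertex has four legs, so the number of internal lines is I = (4V − E)/2. One of the V vertex delta functions expresses overall momentum conservation, so the remaining V − 1 fix internal momenta and leave L = I − V + 1 unconstrained loop momenta.

```python
def phi4_loop_count(vertices, external_legs):
    """Return (internal_lines, loops) for a connected phi^4 diagram.

    Assumes every vertex has exactly four legs and the diagram is connected.
    """
    total_legs = 4 * vertices - external_legs
    if total_legs < 0 or total_legs % 2:
        raise ValueError("no phi^4 diagram with these numbers of vertices and legs")
    internal_lines = total_legs // 2
    loops = internal_lines - vertices + 1
    if loops < 0:
        raise ValueError("too few internal lines for a connected diagram")
    return internal_lines, loops


for v, e in [(1, 4), (2, 4), (2, 2)]:
    lines, loops = phi4_loop_count(v, e)
    print(f"V={v}, E={e}: {lines} internal lines, {loops} loop momenta")
```

The three cases correspond to the first-order four-point function (a tree), the second-order four-point function (one loop), and the second-order two-point function (two loops).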


5.7 Feynman’s Rules 


From this, we can extract graphical rules, called Feynman rules, by which we can
almost by inspection construct Green’s functions of arbitrary complexity. With an
interaction Lagrangian given by $-\lambda\phi^4/4!$, $\mathcal{M}_{fi}$ appearing in Eq. (5.26) can be
calculated as follows:


1. Draw all possible connected, topologically distinct diagrams, including loops, 
with n external legs. Ignore vacuum-to-vacuum graphs. 


2. For each internal line, associate a propagator given by:

$$
i\Delta_F(p) = \frac{i}{p^2 - m^2 + i\epsilon} \tag{5.116}
$$

3. For each vertex, associate the factor $-i\lambda$.

4. For each internal momentum corresponding to an internal loop, associate an
integration factor:

$$
\int\frac{d^4p}{(2\pi)^4} \tag{5.117}
$$


5. Divide each graph by an overall symmetry factor S corresponding to the 
number of ways one can permute the internal lines and vertices, leaving the 
external lines fixed. 


6. Momentum is conserved at each vertex. 


The symmetry factor S is easily calculated. For the four-point function given
above, the 1/4! coming from the interaction Lagrangian cancels the 4! ways in
which the four external lines can be paired off with the four scalar fields appearing
in $\phi^4$, so S = 1. Now consider the connected two-point diagram at second order,
which is a double-loop diagram (which has the topology of the symbol $\ominus$). There
are 4 ways in which each external leg can be connected to each vertex. There are
3 × 2 ways in which the remaining legs of the two vertices can be paired off. So this gives us a
factor of 1/S = (1/4!)(1/4!) × (4 × 4) × (3 × 2) = 1/3!, so S = 6.
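These two counts can be checked by brute force. The sketch below enumerates every complete Wick pairing of the field slots and keeps only those with the required topology; the slot labels, helper names, and the use of `Counter` to describe a topology are illustrative assumptions. For the two-point diagram, the enumeration also includes the contractions with the roles of the two vertices exchanged, and the resulting factor of 2 cancels the 1/2! from the second-order term of the exponential, which is why that factor does not appear in the product quoted above.

```python
from collections import Counter
from fractions import Fraction
from math import factorial

def pairings(slots):
    """Yield every complete pairing (Wick contraction pattern) of a list of field slots."""
    if not slots:
        yield []
        return
    first, rest = slots[0], slots[1:]
    for i, partner in enumerate(rest):
        for tail in pairings(rest[:i] + rest[i + 1:]):
            yield [(first, partner)] + tail

def count_contractions(slots, wanted_links):
    """Count pairings whose multiset of (point, point) links equals wanted_links."""
    count = 0
    for pairing in pairings(slots):
        links = Counter(tuple(sorted((a[0], b[0]))) for a, b in pairing)
        if links == wanted_links:
            count += 1
    return count

# Four-point function at first order: x1..x4 each joined to the single vertex y.
slots_4pt = [(f"x{i}", 0) for i in range(1, 5)] + [("y", leg) for leg in range(4)]
n_tree = count_contractions(slots_4pt, Counter({(f"x{i}", "y"): 1 for i in range(1, 5)}))
weight_tree = Fraction(n_tree, factorial(4))

# Two-point function at second order: three lines join the vertices y1 and y2.
slots_2pt = [("x1", 0), ("x2", 0)] + [(v, leg) for v in ("y1", "y2") for leg in range(4)]
n_sunset = sum(
    count_contractions(slots_2pt, Counter({("x1", a): 1, ("x2", b): 1, ("y1", "y2"): 3}))
    for a, b in (("y1", "y2"), ("y2", "y1"))
)
# 1/2! from the second-order term of the exponential, (1/4!)^2 from the two vertices.
weight_sunset = Fraction(n_sunset, factorial(2) * factorial(4) ** 2)

print(n_tree, weight_tree)
print(n_sunset, weight_sunset)
```

The weights come out as 1 and 1/6, that is, S = 1 and S = 6.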

For QED, the Feynman’s rules are only a bit more complicated. The interaction 
Hamiltonian becomes: 


$$
\mathcal{H}_I = e\,\bar\psi\gamma^\mu\psi A_\mu \tag{5.118}
$$


As before, the power expansion of the interacting Lagrangian will pull down
various factors of $\mathcal{H}_I$. Then we use Wick’s theorem to pair off the various
fermion and vector meson lines to form propagators and vertices. 

There are only a few differences that we must note. First, when contracting 
over an internal fermion loop, we must flip one spinor past the others to perform 
the trace and Wick decomposition. This means that there must be an extra —1 
factor inserted into all fermion loop integrations. 

Second, various vector meson propagators in different gauges may be used, 
but all the terms proportional to $p_\mu$ or $p_\nu$ vanish because of gauge invariance
(which will be discussed more in detail later). 

Thus, the Feynman’s rules for QED become: 


1. For each internal fermion line, associate a propagator given by:

$$
iS_F(p) = \frac{i}{\not\!p - m + i\epsilon} = \frac{i(\not\!p + m)}{p^2 - m^2 + i\epsilon} \tag{5.119}
$$

2. For each internal photon line, associate a propagator:

$$
iD_F(p)_{\mu\nu} = -\frac{i g_{\mu\nu}}{p^2 + i\epsilon} \tag{5.120}
$$

3. At each vertex, place a factor of:

$$
-ie\gamma_\mu \tag{5.121}
$$

4. Insert an additional factor of −1 for each closed fermion loop.

5. For each internal loop, integrate over:

$$
\int\frac{d^4q}{(2\pi)^4} \tag{5.122}
$$


6. A relative factor —1 appears between graphs that differ from each other by an 
interchange of two identical external fermion lines. 


7. Internal fermion lines appear with arrows in both clockwise and counter- 
clockwise directions. However, diagrams that are topologically equivalent 
are counted only once. 


8. External electron and positron lines entering a graph appear with factors
$u(p, s)$ and $\bar v(p, s)$, respectively. External electron and positron lines leaving
a graph appear with factors $\bar u(p, s)$ and $v(p, s)$, respectively. The direction of
the positron lines is taken to be opposite of the electron lines, so that incoming
positrons have momenta leaving the diagram.


Likewise, we can calculate Feynman’s rules for any of the actions that we 
have investigated earlier. 
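For example, for charged scalar electrodynamics, with the additional term in
the Lagrangian:

$$
\mathcal{L} = D_\mu\phi^\dagger D^\mu\phi - m^2\phi^\dagger\phi \tag{5.123}
$$

one has the following interaction Hamiltonian:

$$
\mathcal{H}_I = -ie\,\phi^\dagger\left(\overrightarrow{\partial_\mu} - \overleftarrow{\partial_\mu}\right)\phi\,A^\mu - e^2A_\mu^2\,\phi^\dagger\phi \tag{5.124}
$$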


The Feynman rules are as follows:


1. For each scalar-scalar-vector vertex, insert the factor:

$$
-ie(p + p')_\mu \tag{5.125}
$$

where $p$ and $p'$ are the momenta for the scalar line.


2. Insert a factor of:

$$
2ie^2 g_{\mu\nu} \tag{5.126}
$$

for each “seagull” graph.


3. Insert an additional factor of 1/2 for each closed loop with only two photon
lines.


In summary, we have seen that, historically, there were two ways in which
to quantize QED. The first method, pioneered by Feynman, was the propagator
approach, which was simple, pictorial, but not very rigorous. The second was the
more conventional operator approach of Schwinger and Tomonaga. In this chapter,
we have presented the LSZ approach, which is perhaps the most convenient method
for deriving the Feynman rules for any quantum field theory.

With Feynman rules, one can almost, by inspection, write down the perturbation
expansion for any quantum field theory. In the next chapter, we will use these
rules to calculate higher-order interactions in QED.


5.8 Exercises 


1. Set up the reduction formulas for a massless and massive vector meson.
Derive the counterpart of Eqs. (5.73) and (5.77).


2. Write down the Feynman rules for a massive pseudoscalar field interacting
with a Dirac electron via the interaction $\bar\psi\gamma_5\psi\phi$.


3. Write down the Feynman rules for a massive pseudovector field interacting
with the Dirac field via the term $\bar\psi\gamma_5\gamma^\mu\psi A_\mu$.


4. In the “old-fashioned” noncovariant canonical approach to QED in the
Coulomb gauge, one derived the scattering matrix by solving the Schrödinger
equation with a second-quantized Hamiltonian, in which the interactions were
ordered in time. Draw the complete set of noncovariant diagrams that are necessary
to describe electron-electron scattering in QED to the one-loop level.
Sketch how several noncovariant diagrams can be summed to produce a single
covariant Feynman diagram.


5. Prove: 
ee 1 }42°°* {4M 0192""-ON—M pL fL2°- soa 
(N = My! € yy v2-+-Vyo102"-ON—M = i aa Gif27) 
where: 
oe ce = Cee Oe (5.128) 


6. Show the equivalence of the two expressions for the flux J in Eqs. (5.24) and
(5.25) and show the equality only holds in collinear frames. [Hint: square
Eq. (5.25), and then expand the product of two $\epsilon$ tensors in terms of delta
functions.]


7. In Compton scattering, a photon scatters off an electron. Show that the relative 
velocity $|v_1 - v_2|$ appearing in the cross-section formula can exceed the speed
of light. Is this a violation of relativity? Why or why not? 


8. Prove Furry’s theorem, which states that a Feynman loop diagram containing
an odd number of external photon lines vanishes. (Hint: Show that a fermion
loop with an odd number of legs cancels against another fermion loop with
the arrows reversed, or use the fact that QED is invariant under charge
conjugation. The fields transform as: $\psi \to \psi^c = C\bar\psi^T$ and $A_\mu \to -A_\mu$. Show
that these diagrams are odd under C and hence not allowed.)


9. The factor $\lambda$ in Eq. (5.94) contains Feynman graphs with no external legs.
For QED, draw all such diagrams up to the second-loop level. Do not solve. 


10. If we power expand Eq. (5.89) in a Taylor series, we find:

$$
U(t, t') = 1 + \sum_{n=1}^{\infty}(-i)^n\int_{t'}^{t} dt_1\int_{t'}^{t_1} dt_2\cdots\int_{t'}^{t_{n-1}} dt_n\; T\left[H_I(t_1)\cdots H_I(t_n)\right] \tag{5.129}
$$

Prove this, and then show that it equals Eq. (5.90). (Hint: take the lowest
order, and on a graph, draw the integration regions for $t_1$ and $t_2$. Show that
the identity holds for this case, and then generalize to the arbitrary case.)


11. Prove that the time-ordered product in Eq. (5.73) always emerges when we
make the LSZ reduction of more than one field from the asymptotic states.


12. Let us define the following Green’s function:

$$
\Delta'(x, x') = -i\langle 0|\left[\phi(x), \phi(x')\right]|0\rangle \tag{5.130}
$$

Insert a complete set of intermediate states within the commutator:

$$
1 = \sum_n |n\rangle\langle n|\int d^4q\, \delta^4(p_n - q) \tag{5.131}
$$

Prove that the commutator becomes:

$$
\Delta'(x - x') = \int_0^\infty d\sigma^2\, \rho(\sigma^2)\,\Delta(x - x'; \sigma) \tag{5.132}
$$

where:

$$
\rho(q) = \rho(q^2)\,\theta(q_0) = (2\pi)^3\sum_n \delta^4(p_n - q)\,\left|\langle 0|\phi(0)|n\rangle\right|^2 \tag{5.133}
$$

This is the Källén-Lehmann spectral representation. [Since $\sigma$ appears in
this formula as a mass, this formula states that the complete, interacting value
of $\Delta'$ is equal to the integral over all possible free $\Delta$ with arbitrary mass $\sigma$,
weighted by the unknown function $\rho(\sigma^2)$.]


13. Let us separate out the one-particle contribution to the smeared average. Then
we find:

$$
\Delta'(x - x') = Z\,\Delta(x - x'; m) + \int_{m_1^2}^{\infty} d\sigma^2\, \rho(\sigma^2)\,\Delta(x - x'; \sigma) \tag{5.134}
$$

Also, $m_1^2$ is the lowest mass squared that contributes to the continuum above
the one-particle contribution. For example, for pions, $m_1^2 = 4m_\pi^2$. Take the
time derivative of both sides, reducing the Green’s functions to delta functions.
Then show:

$$
1 = Z + \int_{m_1^2}^{\infty} \rho(\sigma^2)\, d\sigma^2 \tag{5.135}
$$

which implies:

$$
0 \le Z \le 1 \tag{5.136}
$$

When Z = 1, this means that the $\rho$ function is zero, so the theory has collapsed
into a free theory. For an interacting theory, we must have $Z < 1$.


14. Renormalization constants are usually thought to be infinite quantities, yet we 
have just shown that Z is less than one. Is there a contradiction? 


15. For $\phi^4$ theory, prove that the symmetry factor S equals:

$$
S = c\,2^b\prod_{n=2,3,\ldots}(n!)^{a_n} \tag{5.137}
$$

where $a_n$ equals the number of pairs of vertices that are connected by n
identical lines, b is the number of lines that connect a vertex with itself, and
c is the number of permutations of vertices that leave the diagram invariant
when the external lines are fixed. What is S for QED?


Chapter 6 


Scattering Processes 
and the S Matrix 


I was sort of half-dreaming, like a kid would ... that it would be funny if
these funny pictures turned out to be useful, because the damned Physical
Review would be full of these odd-looking things. And that turned out to
be true.
—R. Feynman 


6.1 Compton Effect 


Now that we have derived the Feynman rules for various quantum field theories, 
the next step is to calculate cross sections for elementary processes involving 
photons, electrons, and antielectrons. At the lowest order, these cross sections 
reproduce classical results found with earlier methods. However, the full power of 
the quantum field theory will be seen at higher orders, where we calculate radiative 
corrections to the hydrogen atom that have been verified to great accuracy. In the 
process, we will solve the problem of the electron self-energy, which completely 
eluded earlier, classical attempts by Lorentz and others. 

At the end of this chapter, we will also investigate the S matrix itself. Rather
than appeal to perturbation theory and summing Feynman diagrams, we will
impose mathematical constraints directly on the S matrix, like unitarity and
analyticity, to obtain nontrivial constraints on $\pi$-nucleon scattering. These results
hold without ever appealing to any perturbative power expansion.

The material covered in this chapter is fairly standard, and the reader is urged to 
consult other excellent texts for other details, such as Bjorken and Drell, Itzykson 
and Zuber, and Mandl and Shaw. 

To begin our discussion, we will divide Feynman diagrams into two types, 
“trees” and “loops,” on the basis of their topology. Loop diagrams, as their name 
suggests, have closed loops in them. Tree diagrams have no loops; that is, they 
only have branches. In a scattering process, we will see that the sum over tree 
diagrams is finite and reproduces the classical result. The loop diagrams, by 
contrast, are usually divergent and are purely quantum-mechanical effects.

We start by analyzing the lowest-order terms in the scattering matrix for four
particles or fields. To this order, we find only tree diagrams and no loops. Thus, 
we should be able to reproduce generalizations of classical and nonrelativistic 
physics. (Interestingly enough, we will find that negative energy states cannot 
be omitted in these calculations, even in the nonrelativistic limit. Although these 
negative energy states are purely a byproduct of relativity, if we drop them, then 
we will fail to reproduce the classical and nonrelativistic results.) 

In this chart, we will summarize the scattering processes that we will analyze 
in the first part of this chapter: 


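Compton scattering:   $e^- + \gamma \to e^- + \gamma$
Pair annihilation:    $e^- + e^+ \to \gamma + \gamma$
Møller scattering:    $e^- + e^- \to e^- + e^-$
Bhabha scattering:    $e^- + e^+ \to e^- + e^+$
Bremsstrahlung:       $e^- + N \to e^- + N + \gamma$
Pair creation:        $\gamma + \gamma \to e^- + e^+$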


There is a reason for writing these scattering processes in this particular order. 
If we take the Feynman diagrams for Compton scattering and rotate them by 90 
degrees, we find that they turn into the Feynman diagrams for pair annihilation. 
This is called the substitution rule, where we take the process: 


$$
1 + 2 \to 3 + 4 \tag{6.1}
$$

and convert it into:

$$
1 + \bar 3 \to \bar 2 + 4 \tag{6.2}
$$


Using the substitution rule, we can group these scattering processes into pairs: 


Compton effect      $\leftrightarrow$  Pair annihilation
Møller scattering   $\leftrightarrow$  Bhabha scattering
Bremsstrahlung      $\leftrightarrow$  Pair creation


There are several advantages to using this symmetry. At the superficial level, 
this means that we can, almost by inspection, convert the scattering amplitude 
of one process into that of the other, thereby saving a considerable amount of 
time. At a deeper level, it signals the fact that the S matrix obeys a new kind
of symmetry, called crossing symmetry. If we treat the S matrix as an analytic
function of the energy variables, then crossing symmetry relates different analytic
regions of the S matrix to each other in a non-trivial way.

Figure 6.1. Compton scattering: a photon of momentum $k$ scatters elastically off an
electron of momentum $p_i$.

To begin, the first process we will examine is Compton scattering, which occurs 
when an electron and a photon collide and scatter elastically. Historically, this 
process was crucial in confirming that electromagnetic radiation had particlelike 
properties, that is, that the photon was acting like a particle in colliding with the 
electron. (We will then, using the substitution rule, derive the Feynman amplitude 
for pair annihilation.) 

We will assume that the electron has momentum $p_i$ before the collision and $p_f$
afterwards. The photon has momentum $k$ before and $k'$ afterwards. The reaction
can be represented symbolically as:

$$
\gamma(k) + e(p_i) \to \gamma(k') + e(p_f) \tag{6.3}
$$

By energy-momentum conservation, we also have:

$$
k + p_i = k' + p_f \tag{6.4}
$$


Compton scattering, to lowest order, is shown in Figure 6.1. 
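Energy-momentum conservation, Eq. (6.4), already fixes the energy of the scattered photon once the scattering angle is chosen. As a minimal sketch, assume the electron is initially at rest and work in natural units with energies in MeV (the frame choice, the function names, and the numerical values are illustrative assumptions, not taken from the text). Squaring $p_f = p_i + k - k'$ and using $p_i^2 = p_f^2 = m^2$, $k^2 = k'^2 = 0$ gives $\omega' = \omega / [1 + (\omega/m)(1 - \cos\theta)]$, which the code below checks by confirming that the outgoing electron is on its mass shell.

```python
import math

def scattered_photon_energy(omega, theta, m=0.511):
    """Final photon energy for a photon of energy omega hitting an electron at rest.

    Natural units (c = 1); omega and the electron mass m are in MeV.
    """
    return omega / (1.0 + (omega / m) * (1.0 - math.cos(theta)))

def minkowski_square(p):
    """p^2 = E^2 - |p|^2 for a four-vector p = (E, px, py, pz)."""
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2

def final_electron_momentum(omega, theta, m=0.511):
    """p_f = p_i + k - k' from Eq. (6.4); photon along z, scattering in the x-z plane."""
    omega_prime = scattered_photon_energy(omega, theta, m)
    p_i = (m, 0.0, 0.0, 0.0)
    k = (omega, 0.0, 0.0, omega)
    k_prime = (omega_prime, omega_prime * math.sin(theta), 0.0,
               omega_prime * math.cos(theta))
    return tuple(a + b - c for a, b, c in zip(p_i, k, k_prime))

p_f = final_electron_momentum(omega=1.0, theta=math.pi / 3)
# The outgoing electron should satisfy p_f^2 = m^2 up to rounding error.
print(abs(minkowski_square(p_f) - 0.511 ** 2))
```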
We normalize the wave function of the photon by:

$$
A_\mu(x) = \frac{1}{\sqrt{2kV}}\,\epsilon_\mu\left(e^{-ik\cdot x} + e^{ik\cdot x}\right) \tag{6.5}
$$

To lowest order, the S matrix is:


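$$
\begin{aligned}
S_{fi} = \;&\frac{e^2}{V^2}\,(2\pi)^4\delta^4(p_f + k' - p_i - k)\,\sqrt{\frac{m^2}{E_fE_i}}\;\bar u(p_f, s_f) \\
&\times\left[\frac{(-i\not\!\epsilon')}{\sqrt{2k'}}\,\frac{i}{\not\!p_i + \not\!k - m}\,\frac{(-i\not\!\epsilon)}{\sqrt{2k}} + \frac{(-i\not\!\epsilon)}{\sqrt{2k}}\,\frac{i}{\not\!p_i - \not\!k' - m}\,\frac{(-i\not\!\epsilon')}{\sqrt{2k'}}\right]u(p_i, s_i)
\end{aligned}
\tag{6.6}
$$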

The differential cross section is found in several steps. First, we square the
S matrix, which gives us a divergent result. We divide by the singular quantity
$(2\pi)^4\delta^4(0)$ and obtain the rate of transitions. We divide by the flux $|v|/V$, divide
by the number of particles per unit volume $1/V$, and multiply by the phase factor
for outgoing particles $[V^2/(2\pi)^6]\,d^3p_f\,d^3k'$. This gives us the differential cross
section:


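$$\begin{aligned}
d\sigma &= \frac{|S_{fi}|^2}{(2\pi)^4\delta^4(0)}\,\frac{V^2}{|v|}\,\frac{V^2\,d^3p_f\,d^3k'}{(2\pi)^6} \\
&= \frac{e^4}{(2\pi)^2}\,\frac{m}{E_i}\,\frac{1}{2k|v|}\,\frac{1}{2}\sum_{\rm spins}|\bar u(p_f,s_f)\,\Gamma\,u(p_i,s_i)|^2\;\delta^4(p_i + k - p_f - k')\,\frac{m\,d^3p_f}{E_f}\,\frac{d^3k'}{2k'}
\end{aligned} \tag{6.7}$$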
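where:

$$\Gamma = \frac{\slashed{\epsilon}'\,\slashed{\epsilon}\,\slashed{k}}{2p_i\cdot k} + \frac{\slashed{\epsilon}\,\slashed{\epsilon}'\,\slashed{k}'}{2p_i\cdot k'} \tag{6.8}$$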


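To sum over the spins, we will once again use the convenient formula given
in Eq. (5.54):

$$\sum_{\rm spins}|\bar u(p_f,s_f)\,\Gamma\,u(p_i,s_i)|^2 = {\rm Tr}\left(\Gamma\,\frac{\slashed{p}_i + m}{2m}\,\gamma_0\Gamma^\dagger\gamma_0\,\frac{\slashed{p}_f + m}{2m}\right) \tag{6.9}$$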


Although this calculation looks formidable, we can perform the trace of up to eight 
Dirac matrices by reducing it to the trace of six, and then four Dirac matrices, etc. 
We will use the formula: 


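$$\begin{aligned}
{\rm Tr}\,(\slashed{k}_1\slashed{k}_2\cdots\slashed{k}_{2n}) = {}& k_1\cdot k_2\,{\rm Tr}\,(\slashed{k}_3\cdots\slashed{k}_{2n}) - k_1\cdot k_3\,{\rm Tr}\,(\slashed{k}_2\slashed{k}_4\cdots\slashed{k}_{2n}) \\
&+ \cdots + k_1\cdot k_{2n}\,{\rm Tr}\,(\slashed{k}_2\cdots\slashed{k}_{2n-1})
\end{aligned} \tag{6.10}$$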
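The recursion (6.10) is easy to automate. The following is a minimal sketch, assuming the metric signature (+, −, −, −) and ${\rm Tr}(1) = 4$; the helper names `dot` and `trace_slashed` and the test vectors are illustrative choices, not notation from the text.

```python
def dot(a, b):
    """Minkowski dot product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def trace_slashed(vectors):
    """Trace of a product of slashed four-vectors, using the recursion (6.10)."""
    vectors = list(vectors)
    if not vectors:
        return 4.0                      # Tr(1) = 4 for 4x4 Dirac matrices
    if len(vectors) % 2 == 1:
        return 0.0                      # the trace of an odd product vanishes
    first, rest = vectors[0], vectors[1:]
    total = 0.0
    for j, v in enumerate(rest):
        sign = 1.0 if j % 2 == 0 else -1.0   # alternating signs in (6.10)
        # Contract the first vector with the j-th remaining one and recurse
        # on the product with that vector removed.
        total += sign * dot(first, v) * trace_slashed(rest[:j] + rest[j + 1:])
    return total


# Four arbitrary test vectors (illustrative values only).
a, b, c, d = (1, 2, 0, 1), (0, 1, 3, 2), (2, 0, 1, 1), (1, 1, 1, 0)
four = trace_slashed([a, b, c, d])
expected = 4 * (dot(a, b) * dot(c, d) - dot(a, c) * dot(b, d) + dot(a, d) * dot(b, c))
```

For four vectors the recursion reproduces the familiar result $4[(a\cdot b)(c\cdot d) - (a\cdot c)(b\cdot d) + (a\cdot d)(b\cdot c)]$. The number of terms grows as $(2n-1)!!$, which is why dropping vanishing dot products early saves so much work.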


The problem simplifies enormously because we can eliminate entire groups of 
terms every time certain dot products appear, since: 


$$k^2 = k'^2 = \epsilon\cdot k = \epsilon'\cdot k' = 0 \tag{6.11}$$

We can also simplify the calculation by using:

$$\epsilon^2 = \epsilon'^2 = -1; \qquad p_i^2 = m^2 \tag{6.12}$$
In short, each trace consists of collecting the complete set of all possible pairs of 


dot products of vectors, most of which vanish. Dividing the factors into smaller 
pieces, we now find: 


4 
A (6.13) 
i=] 


spins 
where: 
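$$\begin{aligned}
T_1 &= {\rm Tr}\left[\slashed{\epsilon}'\slashed{\epsilon}\slashed{k}(\slashed{p}_i + m)\slashed{k}\slashed{\epsilon}\slashed{\epsilon}'(\slashed{p}_f + m)\right] \\
&= 2p_i\cdot k\,{\rm Tr}\,(\slashed{\epsilon}'\slashed{\epsilon}\slashed{k}\slashed{\epsilon}\slashed{\epsilon}'\slashed{p}_f) \\
&= 2p_i\cdot k\,{\rm Tr}\,(\slashed{\epsilon}'\slashed{k}\slashed{\epsilon}'\slashed{p}_f) \\
&= 8p_i\cdot k\left[2(\epsilon'\cdot k)^2 + k'\cdot p_i\right]
\end{aligned} \tag{6.14}$$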
and: 


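$$\begin{aligned}
T_2 &= {\rm Tr}\left[\slashed{\epsilon}'\slashed{\epsilon}\slashed{k}(\slashed{p}_i + m)\slashed{k}'\slashed{\epsilon}'\slashed{\epsilon}(\slashed{p}_f + m)\right] \\
&= 2k\cdot p_i\,{\rm Tr}\,(\slashed{k}'\slashed{\epsilon}'\slashed{\epsilon}\slashed{\epsilon}'\slashed{\epsilon}\slashed{p}_i) + 8(k'\cdot\epsilon)^2\,k\cdot p_i - 8(k\cdot\epsilon')^2\,p_i\cdot k' \\
&= 8k\cdot p_i\,k'\cdot p_i\left[2(\epsilon'\cdot\epsilon)^2 - 1\right] + 8(k'\cdot\epsilon)^2\,k\cdot p_i - 8(k\cdot\epsilon')^2\,k'\cdot p_i
\end{aligned} \tag{6.15}$$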
We also have $T_3$ equal to $T_2$, and $T_4$ can be obtained from $T_1$ if we make the
substitution $(\epsilon, k) \to (\epsilon', -k')$. Since the calculation is Lorentz invariant, we
can always take a specific Lorentz frame. We lose no generality by letting the
electron be at rest and letting the incoming photon lie along the z axis. Let the
outgoing photon scatter within the y-z plane, making an angle $\theta$ with the z axis
(Fig. 6.2).
Then the specific parametrization is given by:

$$\begin{aligned}
p_{i,\mu} &= (m, 0, 0, 0) \\
k_\mu &= k\,(1, 0, 0, 1) \\
k'_\mu &= k'\,(1, 0, \sin\theta, \cos\theta) \\
p_{f,\mu} &= (E, 0, -k'\sin\theta, k - k'\cos\theta)
\end{aligned} \tag{6.16}$$

Figure 6.2. Compton scattering in the laboratory frame, where the electron is at rest.


It is important to notice that the only independent variables in this scattering are
$k$ and $\theta$. All other variables can be expressed in terms of these two variables. For
example, we can solve for $k'$ and $E$ in terms of the independent variables $k$ and $\theta$:

$$k' = \frac{k}{1 + (k/m)(1 - \cos\theta)}, \qquad E = m + k - k' \tag{6.17}$$
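As a quick numerical check of Eqs. (6.16) and (6.17), the sketch below computes $k'$ and $E$ and confirms that the outgoing electron is on the mass shell. It assumes natural units ($\hbar = c = 1$) with the electron mass defaulting to 1; the function names are illustrative.

```python
import math


def compton_kinematics(k, theta, m=1.0):
    """Return (k', E, p_f) for Compton scattering off an electron at rest.

    k     : incoming photon energy
    theta : photon scattering angle in the y-z plane (radians)
    m     : electron mass (natural units, hbar = c = 1)
    """
    k_out = k / (1.0 + (k / m) * (1.0 - math.cos(theta)))   # Eq. (6.17)
    energy = m + k - k_out                                   # energy conservation
    # Final electron four-momentum, Eq. (6.16)
    p_f = (energy, 0.0, -k_out * math.sin(theta), k - k_out * math.cos(theta))
    return k_out, energy, p_f


def mass_squared(p):
    """Invariant p^2 with metric (+, -, -, -)."""
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2


k_out, energy, p_f = compton_kinematics(k=2.0, theta=0.7)
```

For a very hard photon scattered straight back ($\theta = \pi$), $k'$ approaches $m/2$ regardless of $k$, which follows directly from Eq. (6.17).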


Adding all four contributions, we now have:

$$\sum_{\rm spins}|\bar u(p_f,s_f)\,\Gamma\,u(p_i,s_i)|^2 = \frac{1}{2m^2}\left(\frac{k'}{k} + \frac{k}{k'} + 4(\epsilon'\cdot\epsilon)^2 - 2\right) \tag{6.18}$$


We now must integrate over the momenta of the outgoing photon $k'$ and electron
$p_f$. Since the only independent variables in the problem are given by $k$ and $\theta$,
all integrations are easy, except for the Jacobian, which arises when we change
variables and integrate over Dirac delta functions. Thus, the integration over $d^3p_f$
is trivial because of momentum conservation; it simply sets the momenta to be the
values given above. That leaves one complication, the integration over the time
component $dp_{f0}$. However, this integration can be rewritten in a simple fashion:

$$\int\frac{d^3p_f}{2E_f} = \int d^4p_f\,\delta(p_f^2 - m^2)\,\theta(p_{f0}) \tag{6.19}$$

The integration over $p_{f0}$ in the integral just sets its value to be the on-shell value.
Finally, this last delta function can be removed because of the integration over $k'$.
The only tricky part is to extract from this last integration the measure when we
integrate over $k'$.

This last delta function can be written as:

$$\delta\left([k + p_i - k']^2 - m^2\right) = \delta\left(2m(k - k') - 2kk'(1 - \cos\theta)\right) = \frac{\delta(k' - k'(k))}{2mk/k'} \tag{6.20}$$

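where $k'(k)$ is the value given in Eq. (6.17). Putting all integration factors together,
we now have:

$$\int\frac{d^3p_f}{2E_f}\,\frac{d^3k'}{2k'}\,\delta^4(p_f + k' - p_i - k) = \frac{k'\,d\Omega}{2\,(2mk/k')} = \frac{k'^2}{4mk}\,d\Omega \tag{6.21}$$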


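Inserting all expressions into the cross section, we obtain the Klein–Nishina
formula¹:

$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{4m^2}\left(\frac{k'}{k}\right)^2\left(\frac{k'}{k} + \frac{k}{k'} + 4(\epsilon\cdot\epsilon')^2 - 2\right) \tag{6.22}$$

If we take the low-energy limit $k \to 0$, then the Klein–Nishina formula reduces
to the Thomson scattering formula:

$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{m^2}(\epsilon\cdot\epsilon')^2 \tag{6.23}$$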

If the initial and final photons are unpolarized, we can average over the initial
polarizations $\epsilon$ and sum over the final polarizations $\epsilon'$. In the particular
parametrization that we have chosen for our momenta, we can choose our polarizations
$\epsilon$ and $\epsilon'$ such that they are purely transverse and perpendicular to the photon
momenta $k$ and $k'$, respectively:

$$\begin{aligned}
\epsilon^{(1)} &= (0, 1, 0, 0) \\
\epsilon^{(2)} &= (0, 0, 1, 0) \\
\epsilon'^{(1)} &= (0, 1, 0, 0) \\
\epsilon'^{(2)} &= (0, 0, \cos\theta, -\sin\theta)
\end{aligned} \tag{6.24}$$

It is easy to check that these polarization vectors satisfy all the required properties.
Then the sum over these polarization vectors is easy to perform:

$$\sum_{i,j}\left(\epsilon^{(i)}\cdot\epsilon'^{(j)}\right)^2 = 1 + \cos^2\theta \tag{6.25}$$




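The averaged cross section is given by:

$$\left.\frac{d\sigma}{d\Omega}\right|_{\rm av} = \frac{\alpha^2}{2m^2}\left(\frac{k'}{k}\right)^2\left(\frac{k'}{k} + \frac{k}{k'} - \sin^2\theta\right) \tag{6.26}$$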
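The step from the polarized formula (6.22) to the averaged result (6.26) can be checked numerically. The sketch below uses the explicit polarization vectors of Eq. (6.24), averages over the two initial polarizations, and sums over the two final ones. Cross sections are expressed in units of $\alpha^2/m^2$; the function names are illustrative.

```python
import math


def dot(a, b):
    """Minkowski dot product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def klein_nishina_polarized(ratio, eps_dot):
    """Eq. (6.22) in units of alpha^2/m^2; ratio = k'/k, eps_dot = eps . eps'."""
    return 0.25 * ratio ** 2 * (ratio + 1.0 / ratio + 4.0 * eps_dot ** 2 - 2.0)


def averaged_from_polarizations(k, theta, m=1.0):
    """Average (6.22) over initial and sum over final photon polarizations."""
    ratio = 1.0 / (1.0 + (k / m) * (1.0 - math.cos(theta)))     # k'/k from (6.17)
    eps_in = [(0, 1, 0, 0), (0, 0, 1, 0)]                        # Eq. (6.24)
    eps_out = [(0, 1, 0, 0), (0, 0, math.cos(theta), -math.sin(theta))]
    total = sum(klein_nishina_polarized(ratio, dot(e, ep))
                for e in eps_in for ep in eps_out)
    return 0.5 * total                                           # average over initial


def averaged_closed_form(k, theta, m=1.0):
    """Eq. (6.26) in units of alpha^2/m^2."""
    ratio = 1.0 / (1.0 + (k / m) * (1.0 - math.cos(theta)))
    return 0.5 * ratio ** 2 * (ratio + 1.0 / ratio - math.sin(theta) ** 2)


lhs = averaged_from_polarizations(k=1.5, theta=0.9)
rhs = averaged_closed_form(k=1.5, theta=0.9)
```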
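The integral over $\theta$ is straightforward. Let us define $z = \cos\theta$. The integration
yields:

$$\begin{aligned}
\sigma &= \frac{\pi\alpha^2}{m^2}\int_{-1}^{1}dz\left\{\frac{1}{[1 + a(1-z)]^3} + \frac{1}{1 + a(1-z)} - \frac{1 - z^2}{[1 + a(1-z)]^2}\right\} \\
&= \frac{8\pi\alpha^2}{3m^2}\,\frac{3}{4}\left\{\frac{1+a}{a^3}\left[\frac{2a(1+a)}{1+2a} - \log(1+2a)\right] + \frac{\log(1+2a)}{2a} - \frac{1+3a}{(1+2a)^2}\right\}
\end{aligned} \tag{6.27}$$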


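where $a = k/m$. For small energies, this reduces to the usual Thomson total
cross section:

$$\lim_{k\to 0}\sigma = \sigma_{\rm Thomson} = \frac{8\pi\alpha^2}{3m^2} = 0.665\times 10^{-24}\ {\rm cm}^2 \tag{6.28}$$

For high energies, the logarithm starts to dominate the cross section:

$$\lim_{k\to\infty}\sigma = \frac{\pi\alpha^2}{km}\left[\log\frac{2k}{m} + \frac{1}{2} + O\left(\frac{m}{k}\log\frac{k}{m}\right)\right] \tag{6.29}$$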
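The closed form (6.27) and its two limits (6.28) and (6.29) can be verified by integrating the averaged cross section (6.26) directly. The following is a minimal sketch using a hand-written Simpson rule; cross sections are in units of $\alpha^2/m^2$ and $a = k/m$. The helper names are illustrative.

```python
import math


def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0


def sigma_numeric(a):
    """Total cross section from the z-integral in (6.27), units alpha^2/m^2."""
    def integrand(z):
        d = 1.0 + a * (1.0 - z)
        return 1.0 / d ** 3 + 1.0 / d - (1.0 - z * z) / d ** 2
    return math.pi * simpson(integrand, -1.0, 1.0)


def sigma_closed(a):
    """Closed form of (6.27), units alpha^2/m^2."""
    log_term = math.log(1.0 + 2.0 * a)
    bracket = ((1.0 + a) / a ** 3 * (2.0 * a * (1.0 + a) / (1.0 + 2.0 * a) - log_term)
               + log_term / (2.0 * a)
               - (1.0 + 3.0 * a) / (1.0 + 2.0 * a) ** 2)
    return (8.0 * math.pi / 3.0) * 0.75 * bracket


SIGMA_THOMSON = 8.0 * math.pi / 3.0      # Eq. (6.28), units alpha^2/m^2
```

Note that the closed form suffers from numerical cancellation for very small $a$, so the numerical integral is the safer way to approach the Thomson limit.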


6.2 Pair Annihilation


The Feynman graphs for pair annihilation of an electron and positron into two
gamma rays are shown in Figure 6.3. Pair annihilation is represented by the
process:

$$e^-(p_1) + e^+(p_2) \to \gamma(k_1) + \gamma(k_2) \tag{6.30}$$


Figure 6.3. Pair annihilation: an electron of momentum $p_1$ annihilates with a positron of
momentum $p_2$ into two photons.

However, notice that we can obtain this diagram if we simply rotate the diagram
for Compton scattering in Figure 6.1. Thus, by a subtle redefinition of the various
momenta, we should be able to convert the Compton scattering amplitude, which
we have just calculated, into the amplitude for pair annihilation. This is the
substitution rule to which we referred earlier. For example, the S matrix now
yields, to lowest order:


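$$\begin{aligned}
S_{fi} = {}& \frac{e^2}{V^2}\,(2\pi)^4\delta^4(k_1 + k_2 - p_1 - p_2)\,\frac{1}{\sqrt{2k_1\,2k_2}} \\
&\times\sqrt{\frac{m}{E_2}}\,\bar v(p_2,s_2)\left[(-i\slashed{\epsilon}_2)\frac{i}{\slashed{p}_1 - \slashed{k}_1 - m}(-i\slashed{\epsilon}_1) + (-i\slashed{\epsilon}_1)\frac{i}{\slashed{p}_1 - \slashed{k}_2 - m}(-i\slashed{\epsilon}_2)\right]\sqrt{\frac{m}{E_1}}\,u(p_1,s_1)
\end{aligned} \tag{6.31}$$

where we have made the substitutions:

$$\begin{aligned}
(k, \epsilon) &\to (-k_1, \epsilon_1) \\
(k', \epsilon') &\to (k_2, \epsilon_2) \\
u(p_i, s_i) &\to u(p_1, s_1) \\
\bar u(p_f, s_f) &\to \bar v(p_2, s_2)
\end{aligned} \tag{6.32}$$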


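We will, as usual, take the Lorentz frame where the electron is at rest. Then our
momenta become, as in Figure 6.4:

$$\begin{aligned}
p_{1\mu} &= (m, 0, 0, 0) \\
p_{2\mu} &= (E, 0, 0, |\mathbf{p}|) \\
k_{1\mu} &= k_1\,(1, 0, \sin\theta, \cos\theta) \\
k_{2\mu} &= (k_2, 0, -k_1\sin\theta, |\mathbf{p}| - k_1\cos\theta)
\end{aligned} \tag{6.33}$$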


Figure 6.4. Pair annihilation in the laboratory frame, where the electron is at rest.

There are only two independent variables in this process, $|\mathbf{p}|$ and $\theta$. All other
variables can be expressed in terms of them. For example, we can easily show
that:

$$k_1 = \frac{m(m+E)}{m + E - |\mathbf{p}|\cos\theta}, \qquad k_2 = \frac{(m+E)(E - |\mathbf{p}|\cos\theta)}{m + E - |\mathbf{p}|\cos\theta} \tag{6.34}$$

We also have:

$$\begin{aligned}
m + E &= k_1 + k_2 \\
\mathbf{p} &= \mathbf{k}_1 + \mathbf{k}_2 \\
m(m+E) &= k_1(m + E - |\mathbf{p}|\cos\theta)
\end{aligned} \tag{6.35}$$
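The relations (6.34) and (6.35) can be confirmed numerically from the parametrization (6.33). The sketch below checks energy conservation and that the second photon is massless. It assumes natural units with the electron mass defaulting to 1; the function name is illustrative.

```python
import math


def annihilation_photon_energies(p_abs, theta, m=1.0):
    """Photon energies (k1, k2) for e+ of momentum |p| hitting an e- at rest."""
    energy = math.sqrt(p_abs ** 2 + m ** 2)          # positron energy E
    denom = m + energy - p_abs * math.cos(theta)
    k1 = m * (m + energy) / denom                    # Eq. (6.34)
    k2 = (m + energy) * (energy - p_abs * math.cos(theta)) / denom
    return k1, k2, energy


p_abs, theta = 2.0, 0.8
k1, k2, energy = annihilation_photon_energies(p_abs, theta)

# Second photon four-momentum from Eq. (6.33); it must be lightlike.
k2_vec = (k2, 0.0, -k1 * math.sin(theta), p_abs - k1 * math.cos(theta))
k2_squared = k2_vec[0] ** 2 - k2_vec[1] ** 2 - k2_vec[2] ** 2 - k2_vec[3] ** 2
```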


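When we contract over the Dirac matrices, the calculation proceeds just as
before, except that we want to evaluate $|\bar v\,\Gamma\,u|^2$. We have to use Eq. (3.111):

$$\sum_{\rm spins} v(p,s)\,\bar v(p,s) = \frac{\slashed{p} - m}{2m} \tag{6.36}$$

The trace becomes:

$${\rm Trace} = \frac{1}{2m^2}\left(\frac{k_2}{k_1} + \frac{k_1}{k_2} - 4(\epsilon_1\cdot\epsilon_2)^2 + 2\right) \tag{6.37}$$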


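The integration over $d^3k_1$ and $d^3k_2$ also proceeds as before. The integrations
over the delta functions are straightforward, except that we must be careful when
picking up a measure term when we make a transformation on a Dirac delta
function. When this additional measure term is inserted, the differential cross
section becomes:

$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2(m+E)}{8|\mathbf{p}|\,(m + E - |\mathbf{p}|\cos\theta)^2}\left(\frac{k_2}{k_1} + \frac{k_1}{k_2} - 4(\epsilon_1\cdot\epsilon_2)^2 + 2\right) \tag{6.38}$$

The total cross section is obtained by summing over photon polarizations. As
before, we can take a specific set of polarizations which are transverse to the
photon momenta $k_1$ and $k_2$:

$$\begin{aligned}
\epsilon_1^{(1)} &= (0, 1, 0, 0) \\
\epsilon_1^{(2)} &= (0, 0, \cos\theta, -\sin\theta) \\
\epsilon_2^{(1)} &= (0, 1, 0, 0) \\
\epsilon_2^{(2)} &= \frac{1}{k_2}\,(0, 0, |\mathbf{p}| - k_1\cos\theta, k_1\sin\theta)
\end{aligned} \tag{6.39}$$

For the second pair of polarizations, we find:

$$\epsilon_1^{(2)}\cdot\epsilon_2^{(2)} = -\left(1 - \frac{(k_1 + k_2)^2}{2k_1k_2}\right) = -\left(1 - \frac{2m(m+E)}{2k_1k_2}\right) = -\left[1 - m\left(\frac{1}{k_1} + \frac{1}{k_2}\right)\right] \tag{6.40}$$

where $(k_1 + k_2)^2$ is the square of the total four-momentum. Then the sum over
polarizations can be written as:

$$\sum_{i,j}\left(\epsilon_1^{(i)}\cdot\epsilon_2^{(j)}\right)^2 = 1 + \left[1 - m\left(\frac{1}{k_1} + \frac{1}{k_2}\right)\right]^2 \tag{6.41}$$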


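The only integration left is the one over the solid angle, which leaves us with
($\gamma = E/m$):

$$\sigma = \frac{\pi\alpha^2}{m^2(1+\gamma)}\left[\frac{\gamma^2 + 4\gamma + 1}{\gamma^2 - 1}\log\left(\gamma + \sqrt{\gamma^2 - 1}\right) - \frac{\gamma + 3}{\sqrt{\gamma^2 - 1}}\right] \tag{6.42}$$

which is a result first obtained by Dirac.²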


6.3 Møller Scattering


Next, we investigate electron-electron scattering. To lowest order, this scattering
amplitude contains two graphs, as in Figure 6.5. This scattering is represented by:

$$e(p_1) + e(p_2) \to e(p_1') + e(p_2') \tag{6.43}$$

Figure 6.5. Møller scattering of two electrons with momenta $p_1$ and $p_2$.


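By a straightforward application of the formulas for differential cross sections,
we find:

$$d\sigma = \frac{e^4m^4}{[(p_1\cdot p_2)^2 - m^4]^{1/2}}\,|\mathcal{M}_{fi}|^2\,(2\pi)^4\delta^4(p_1 + p_2 - p_1' - p_2')\,\frac{d^3p_1'}{(2\pi)^3E_1'}\,\frac{d^3p_2'}{(2\pi)^3E_2'} \tag{6.44}$$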


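where, using a straightforward application of Feynman's rules for these two
diagrams, we can compute $|\mathcal{M}_{fi}|^2$:

$$\begin{aligned}
|\mathcal{M}_{fi}|^2 = \frac{1}{4}\Bigg\{ & {\rm Tr}\left(\gamma_\mu\frac{\slashed{p}_1 + m}{2m}\gamma_\nu\frac{\slashed{p}_1' + m}{2m}\right){\rm Tr}\left(\gamma^\mu\frac{\slashed{p}_2 + m}{2m}\gamma^\nu\frac{\slashed{p}_2' + m}{2m}\right)\frac{1}{[(p_1' - p_1)^2]^2} \\
& - {\rm Tr}\left(\gamma_\mu\frac{\slashed{p}_1 + m}{2m}\gamma_\nu\frac{\slashed{p}_2' + m}{2m}\gamma^\mu\frac{\slashed{p}_2 + m}{2m}\gamma^\nu\frac{\slashed{p}_1' + m}{2m}\right)\frac{1}{(p_1' - p_1)^2(p_2' - p_1)^2} + (p_1' \leftrightarrow p_2')\Bigg\}
\end{aligned} \tag{6.45}$$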


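Since the trace is over only four Dirac matrices, taking the trace is not hard to do:

$$\begin{aligned}
|\mathcal{M}_{fi}|^2 = \frac{1}{2m^4}\Bigg\{ & \frac{(p_1\cdot p_2)^2 + (p_1\cdot p_2')^2 + 2m^2(p_1\cdot p_2' - p_1\cdot p_2)}{[(p_1' - p_1)^2]^2} \\
& + \frac{(p_1\cdot p_2)^2 + (p_1\cdot p_1')^2 + 2m^2(p_1\cdot p_1' - p_1\cdot p_2)}{[(p_2' - p_1)^2]^2} \\
& + \frac{2\left[(p_1\cdot p_2)^2 - 2m^2\,p_1\cdot p_2\right]}{(p_1' - p_1)^2(p_2' - p_1)^2}\Bigg\}
\end{aligned} \tag{6.46}$$

Figure 6.6. Møller scattering in the center-of-mass frame.

The kinematics is illustrated in Figure 6.6. Without any loss of generality, we can
choose the center-of-mass frame, where the electron momenta $p_1$ and $p_2$ lie along
the z axis:

$$\begin{aligned}
p_{1\mu} &= (E, 0, 0, |\mathbf{p}|) \\
p_{2\mu} &= (E, 0, 0, -|\mathbf{p}|) \\
p'_{1\mu} &= (E, 0, |\mathbf{p}|\sin\theta, |\mathbf{p}|\cos\theta) \\
p'_{2\mu} &= (E, 0, -|\mathbf{p}|\sin\theta, -|\mathbf{p}|\cos\theta)
\end{aligned} \tag{6.47}$$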


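The only independent variables are $\theta$ and $|\mathbf{p}|$. In terms of this parametrization, we
easily find:

$$\begin{aligned}
p_1\cdot p_2 &= 2E^2 - m^2 \\
p_1\cdot p_1' &= E^2(1 - \cos\theta) + m^2\cos\theta \\
p_1\cdot p_2' &= E^2(1 + \cos\theta) - m^2\cos\theta
\end{aligned} \tag{6.48}$$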
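The invariants in Eq. (6.48) follow directly from the center-of-mass momenta (6.47), as the short sketch below confirms. It assumes the metric signature (+, −, −, −) and natural units; the function names are illustrative.

```python
import math


def dot(a, b):
    """Minkowski dot product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def moller_cm_momenta(energy, theta, m=1.0):
    """Four-momenta p1, p2, p1', p2' in the center-of-mass frame, Eq. (6.47)."""
    p = math.sqrt(energy ** 2 - m ** 2)
    p1 = (energy, 0.0, 0.0, p)
    p2 = (energy, 0.0, 0.0, -p)
    p1_out = (energy, 0.0, p * math.sin(theta), p * math.cos(theta))
    p2_out = (energy, 0.0, -p * math.sin(theta), -p * math.cos(theta))
    return p1, p2, p1_out, p2_out


E_cm, angle, mass = 3.0, 1.2, 1.0
p1, p2, p1_out, p2_out = moller_cm_momenta(E_cm, angle, mass)
c = math.cos(angle)
```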


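Then the entire cross section can be written in terms of these independent variables.
We finally obtain the Møller formula³ in the center-of-mass frame:

$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2(2E^2 - m^2)^2}{4E^2(E^2 - m^2)^2}\left[\frac{4}{\sin^4\theta} - \frac{3}{\sin^2\theta} + \frac{(E^2 - m^2)^2}{(2E^2 - m^2)^2}\left(1 + \frac{4}{\sin^2\theta}\right)\right] \tag{6.49}$$

In the relativistic limit, as $E \to \infty$, this formula reduces to:

$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{E^2}\left(\frac{4}{\sin^4\theta} - \frac{2}{\sin^2\theta} + \frac{1}{4}\right) \tag{6.50}$$

For the low-energy, nonrelativistic result, we find:

$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{m^2}\,\frac{1}{4v^4}\left(\frac{4}{\sin^4\theta} - \frac{3}{\sin^2\theta}\right) \tag{6.51}$$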
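The two limits of the Møller formula can be checked numerically. The sketch below evaluates Eq. (6.49) and compares it with (6.50) at high energy and with (6.51) at low velocity. Cross sections are in units of $\alpha^2$, and $v$ is taken to be $|\mathbf{p}|/m$ in the nonrelativistic limit (an assumption about the intended definition); the function names are illustrative.

```python
import math


def moller_cm(energy, theta, m=1.0):
    """Moller formula (6.49) in the center-of-mass frame, units alpha^2."""
    s2 = math.sin(theta) ** 2
    p2 = energy ** 2 - m ** 2                       # |p|^2
    pref = (2 * energy ** 2 - m ** 2) ** 2 / (4 * energy ** 2 * p2 ** 2)
    bracket = (4 / s2 ** 2 - 3 / s2
               + p2 ** 2 / (2 * energy ** 2 - m ** 2) ** 2 * (1 + 4 / s2))
    return pref * bracket


def moller_relativistic(energy, theta):
    """High-energy limit, Eq. (6.50)."""
    s2 = math.sin(theta) ** 2
    return (4 / s2 ** 2 - 2 / s2 + 0.25) / energy ** 2


def moller_nonrelativistic(v, theta, m=1.0):
    """Low-energy limit, Eq. (6.51), with v = |p|/m."""
    s2 = math.sin(theta) ** 2
    return (4 / s2 ** 2 - 3 / s2) / (m ** 2 * 4 * v ** 4)


theta = 1.1
high = moller_cm(1e3, theta) / moller_relativistic(1e3, theta)
p_small = 1e-3
low = moller_cm(math.sqrt(1 + p_small ** 2), theta) / moller_nonrelativistic(p_small, theta)
```

Both ratios approach 1, with corrections of order $m^2/E^2$ and $v^2$ respectively.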
6.4 Bhabha Scattering


To calculate the cross section for electron–positron scattering (Fig. 6.7), we can
use the substitution rule. By rotating the diagram for Møller scattering, we find
the Feynman diagrams for Bhabha scattering.⁴ The only substitutions we must
make are:


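$$\begin{aligned}
u(p_1) &\to u(p_1) \\
u(p_1') &\to u(p_1') \\
u(p_2) &\to v(-p_2) \\
u(p_2') &\to v(-p_2')
\end{aligned} \tag{6.52}$$

for the process:

$$e^-(p_1) + e^+(-p_2') \to e^-(p_1') + e^+(-p_2) \tag{6.53}$$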


The calculation and the traces are performed exactly as before with these
simple substitutions. We merely quote the final result, due to Bhabha (1935):

$$\begin{aligned}
\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{2E^2}\Bigg( & \frac{5}{4} - \frac{8E^4 - m^4}{4E^2(E^2 - m^2)(1 - \cos\theta)} + \frac{(2E^2 - m^2)^2}{2(E^2 - m^2)^2(1 - \cos\theta)^2} \\
& + [16E^4]^{-1}\left[2E^4(-1 + 2\cos\theta + \cos^2\theta) + 4E^2m^2(1 - \cos\theta)(2 + \cos\theta) + 2m^4\cos^2\theta\right]\Bigg)
\end{aligned} \tag{6.54}$$

Figure 6.7. Bhabha scattering of an electron off a positron.

In the relativistic limit, we have:

$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{8E^2}\left(\frac{1 + \cos^4(\theta/2)}{\sin^4(\theta/2)} + \frac{1}{2}(1 + \cos^2\theta) - \frac{2\cos^4(\theta/2)}{\sin^2(\theta/2)}\right) \tag{6.55}$$

In the nonrelativistic limit, we find:

$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{m^2}\,\frac{1}{16v^4\sin^4(\theta/2)} \tag{6.56}$$
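The limits (6.55) and (6.56) provide a useful consistency check on the Bhabha formula. The sketch below evaluates Eq. (6.54) with the coefficient $4E^2(E^2 - m^2)(1 - \cos\theta)$ in the denominator of the second term, which is the value for which the relativistic limit (6.55) is reproduced; treat that coefficient as an assumption of this sketch. Cross sections are in units of $\alpha^2$ and $v = |\mathbf{p}|/m$; the function names are illustrative.

```python
import math


def bhabha_cm(energy, theta, m=1.0):
    """Bhabha formula (6.54) in the center-of-mass frame, units alpha^2."""
    c = math.cos(theta)
    e2, m2 = energy ** 2, m ** 2
    p2 = e2 - m2
    bracket = (1.25
               - (8 * e2 ** 2 - m2 ** 2) / (4 * e2 * p2 * (1 - c))
               + (2 * e2 - m2) ** 2 / (2 * p2 ** 2 * (1 - c) ** 2)
               + (2 * e2 ** 2 * (-1 + 2 * c + c * c)
                  + 4 * e2 * m2 * (1 - c) * (2 + c)
                  + 2 * m2 ** 2 * c * c) / (16 * e2 ** 2))
    return bracket / (2 * e2)


def bhabha_relativistic(energy, theta):
    """High-energy limit, Eq. (6.55)."""
    s2 = math.sin(theta / 2) ** 2
    c2 = math.cos(theta / 2) ** 2
    return ((1 + c2 ** 2) / s2 ** 2
            + 0.5 * (1 + math.cos(theta) ** 2)
            - 2 * c2 ** 2 / s2) / (8 * energy ** 2)


def bhabha_nonrelativistic(v, theta, m=1.0):
    """Low-energy limit, Eq. (6.56), with v = |p|/m."""
    return 1.0 / (m ** 2 * 16 * v ** 4 * math.sin(theta / 2) ** 4)


theta = 1.1
high = bhabha_cm(1e3, theta) / bhabha_relativistic(1e3, theta)
p_small = 1e-3
low = bhabha_cm(math.sqrt(1 + p_small ** 2), theta) / bhabha_nonrelativistic(p_small, theta)
```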


6.5  Bremsstrahlung 


Bremsstrahlung is the process by which radiation is emitted from an electron as it 
moves past a nucleus (Fig. 6.8). Momentum conservation gives us: 


$$p_i + q = k + p_f \tag{6.57}$$


Classically, one can calculate the radiation emitted by a moving charge as it
accelerates past a proton. (Bremsstrahlung means "braking radiation.") However,
unlike the previous scattering processes, which agree to first order with the
experimental data, we find a severe problem with this amplitude, which is the infrared
divergence. The quantum field theory calculation, to lowest order, reproduces the
classical result, including the unwanted infrared divergence, which has its roots
in the classical theory.

Figure 6.8. Bremsstrahlung, or the radiation emitted by an electron scattering in the
presence of a nucleus.

Although the infrared divergence first arose (in another form) in the classical
theory, the final resolution of this problem comes when we take into account higher
quantum loop corrections to the scattering amplitude. The scattering matrix, using
Feynman's rules, is:


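$$\begin{aligned}
S_{fi} = {}& -\frac{Ze^3}{V^{3/2}}\,2\pi\delta(E_f + k - E_i)\,\frac{1}{\sqrt{2k}}\,\sqrt{\frac{m^2}{E_fE_i}}\,\frac{1}{|\mathbf{q}|^2} \\
&\times\bar u(p_f,s_f)\left[(-i\slashed{\epsilon})\frac{i}{\slashed{p}_f + \slashed{k} - m}(-i\gamma_0) + (-i\gamma_0)\frac{i}{\slashed{p}_i - \slashed{k} - m}(-i\slashed{\epsilon})\right]u(p_i,s_i)
\end{aligned} \tag{6.58}$$

The differential cross section now becomes: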


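$$d\sigma = \frac{mZ^2e^6}{E_i|v_i|}\,2\pi\delta(E_f + \omega - E_i)\,|\bar u(p_f,s_f)\,\Gamma\,u(p_i,s_i)|^2\,\frac{1}{|\mathbf{q}|^4}\,\frac{m\,d^3p_f\,d^3k}{2\omega E_f(2\pi)^6} \tag{6.59}$$

where $\omega = k_0$ and where:

$$\Gamma = \slashed{\epsilon}\,\frac{1}{\slashed{p}_f + \slashed{k} - m}\,\gamma_0 + \gamma_0\,\frac{1}{\slashed{p}_i - \slashed{k} - m}\,\slashed{\epsilon} \tag{6.60}$$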


The trace we wish to calculate is: 


T = @?m)1+65+) 
e gist ae 0 ,oi-~ ktm itm 
: smn (+ Opp Opi ke ies) 
oPst K+m as B= K+m o\{Petm 
a es 


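The traces involved in the calculation yield:

$$\begin{aligned}
T_1 &= (p_f\cdot k)^{-2}\sum_\epsilon{\rm Tr}\left[\slashed{\epsilon}(\slashed{p}_f + \slashed{k} + m)\gamma^0(\slashed{p}_i + m)\gamma^0(\slashed{p}_f + \slashed{k} + m)\slashed{\epsilon}(\slashed{p}_f + m)\right] \\
&= 8(p_f\cdot k)^{-2}\sum_\epsilon\Big[2(\epsilon\cdot p_f)^2\left(m^2 + 2p_i^0p_f^0 + 2p_i^0\omega - p_i\cdot p_f - p_i\cdot k\right) \\
&\qquad\qquad + 2\,\epsilon\cdot p_f\,\epsilon\cdot p_i\,k\cdot p_f + 2p_i^0\,\omega\,k\cdot p_f - p_i\cdot k\,p_f\cdot k\Big]
\end{aligned} \tag{6.62}$$

and:

$$T_2 = T_1(p_i \leftrightarrow -p_f) \tag{6.63}$$

and lastly:

$$\begin{aligned}
T_3 &= -(p_f\cdot k)^{-1}(p_i\cdot k)^{-1}\sum_\epsilon{\rm Tr}\left[\gamma^0(\slashed{p}_i - \slashed{k} + m)\slashed{\epsilon}(\slashed{p}_i + m)\gamma^0(\slashed{p}_f + \slashed{k} + m)\slashed{\epsilon}(\slashed{p}_f + m) + (p_i \leftrightarrow -p_f)\right] \\
&= 16(p_f\cdot k)^{-1}(p_i\cdot k)^{-1}\sum_\epsilon\Big[\epsilon\cdot p_i\,\epsilon\cdot p_f\left(p_i\cdot k - p_f\cdot k + 2p_i\cdot p_f - 4p_i^0p_f^0 - 2m^2\right) \\
&\qquad + (\epsilon\cdot p_f)^2\,k\cdot p_i - (\epsilon\cdot p_i)^2\,k\cdot p_f + p_i\cdot k\,p_f\cdot k - m^2\omega^2 \\
&\qquad + \omega\left(\omega\,p_i\cdot p_f - p_i^0\,p_f\cdot k - p_f^0\,p_i\cdot k\right)\Big]
\end{aligned} \tag{6.64}$$

Figure 6.9. Bremsstrahlung, where the emitted photon is in the z direction and the emitted
electron is in the y-z plane. The incoming electron is in a plane rotated by angle $\phi$ from
the y-z plane.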


The parametrization of the momenta is a bit complicated, since the reaction
does not take place in a plane. In Figure 6.9, we place the emitted photon momentum
in the z direction, the emitted electron momentum $p_f$ in the y-z plane, and the
incoming electron momentum $p_i$ in a plane that is rotated by an angle $\phi$ from the
y-z plane.

The specific parametrization is equal to:

$$\begin{aligned}
k_\mu &= \omega\,(1, 0, 0, 1) \\
p_{f,\mu} &= (E_f, 0, p_f\sin\theta_f, p_f\cos\theta_f) \\
p_{i,\mu} &= (E_i, p_i\sin\theta_i\sin\phi, p_i\sin\theta_i\cos\phi, p_i\cos\theta_i)
\end{aligned} \tag{6.65}$$

where $p_i = |\mathbf{p}_i|$ and $p_f = |\mathbf{p}_f|$. Now we must calculate the sum over transverse
photon polarizations. Since $k_\mu$ points in the z direction, we can choose:

$$\begin{aligned}
\epsilon^{(1)} &= (0, 1, 0, 0) \\
\epsilon^{(2)} &= (0, 0, 1, 0)
\end{aligned} \tag{6.66}$$

which satisfies all the desired properties of the polarization vector. With this
choice of parametrization for the momenta and polarizations, we easily find:

$$\begin{aligned}
\sum_\epsilon(\epsilon\cdot p_f)^2 &= p_f^2\sin^2\theta_f \\
\sum_\epsilon(\epsilon\cdot p_i)^2 &= p_i^2\sin^2\theta_i \\
\sum_\epsilon(\epsilon\cdot p_f)(\epsilon\cdot p_i) &= p_ip_f\sin\theta_f\sin\theta_i\cos\phi
\end{aligned} \tag{6.67}$$

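It is now a simple matter to collect everything together, and we now have the
Bethe–Heitler formula⁵ (1934), which was first computed without using Feynman's
rules:

$$\begin{aligned}
d\sigma = {}& \frac{Z^2\alpha^3}{(2\pi)^2}\,\frac{p_f}{p_i\,q^4}\,\frac{d\omega}{\omega}\,d\Omega_\gamma\,d\Omega_e \\
&\times\Bigg[\frac{p_f^2\sin^2\theta_f}{(E_f - p_f\cos\theta_f)^2}\,(4E_i^2 - q^2) + \frac{p_i^2\sin^2\theta_i}{(E_i - p_i\cos\theta_i)^2}\,(4E_f^2 - q^2) \\
&\qquad + 2\omega^2\,\frac{p_i^2\sin^2\theta_i + p_f^2\sin^2\theta_f}{(E_f - p_f\cos\theta_f)(E_i - p_i\cos\theta_i)} \\
&\qquad - 2\,\frac{p_fp_i\sin\theta_i\sin\theta_f\cos\phi}{(E_f - p_f\cos\theta_f)(E_i - p_i\cos\theta_i)}\,(4E_iE_f - q^2 + 2\omega^2)\Bigg]
\end{aligned} \tag{6.68}$$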

Now let us make the approximation that $\omega \to 0$. In the soft bremsstrahlung
limit, we find a great simplification, and the differential cross section becomes the
one found by classical methods:

$$\frac{d\sigma}{d\Omega} \approx \left(\frac{d\sigma}{d\Omega}\right)_0\frac{e^2\,d^3k}{2\omega(2\pi)^3}\left(\frac{\epsilon\cdot p_f}{k\cdot p_f} - \frac{\epsilon\cdot p_i}{k\cdot p_i}\right)^2 \tag{6.69}$$

Here the infrared divergence⁶⁻⁸ appears for the first time. This problem was
first correctly analyzed by Bloch and Nordsieck.⁸ The squared term contributes a
factor of $1/\omega^2$, so the integral over $d^3k/\omega$ diverges logarithmically for small
$\omega$, and therefore the amplitude for soft photon emission makes no
sense. This is rather discouraging, and revealed the necessity of properly adding
all quantum corrections. The resolution of this question only comes when the
one-loop vertex corrections are added in properly, which we will discuss later in
this chapter. For now, it is important to understand where the divergence comes
from and its general form.

The infrared divergence always emerges whenever we have massless particles 
in a theory that can be emitted from an initial or final leg that is on the mass shell. 
For example, whenever we emit a soft photon of momentum k from an on-shell 
electron with momentum p, we find that the propagator just before the emission 
is given by: 

$$\frac{1}{(p + k)^2 - m^2} \sim \frac{1}{2p\cdot k} \sim \infty \tag{6.70}$$
Because $p^2 = m^2$ and $k$ is small, we find that an integration over momentum
$k$ inevitably produces an infrared divergence. In order to quantify this infrared
divergence, let us perform the integration over the momentum $k$, separating out
the angular part $d\Omega$ from $d^3k$. To parametrize the divergence, we will regulate
the integral by allowing the photon to have a small but finite mass $\mu$. (This is, of
course, a bit delicate since we are breaking gauge invariance by having massive
photons, but one can show that, at this order of approximation, there are no
problems.) We will integrate $k$ from $\mu$ to some energy $E$ given by the sensitivity
of the detector. Expanding out the expression in the square and summing over
polarizations, the cross section now becomes:

$$\frac{d\sigma}{d\Omega} \approx \left(\frac{d\sigma}{d\Omega}\right)_0\frac{\alpha}{4\pi^2}\int_\mu^E k\,dk\int d\Omega_k\left(\frac{2p_f\cdot p_i}{(k\cdot p_f)(k\cdot p_i)} - \frac{m^2}{(k\cdot p_f)^2} - \frac{m^2}{(k\cdot p_i)^2}\right) \tag{6.71}$$


In our approximation, we can perform the angular integral over the three terms
in the large parentheses.

Let us calculate the last two terms appearing on the right-hand side of the
equation. We use the fact that:

$$k\cdot p_f \approx kE(1 - \beta\cos\theta_f) \tag{6.72}$$

Because $d\Omega = 2\pi\,d(\cos\theta)$, we can trivially integrate the last two terms:

$$\int\frac{d\Omega}{4\pi}\,\frac{m^2}{(k\cdot p_f)^2} = \frac{m^2}{2k^2E^2}\int_{-1}^{1}\frac{d(\cos\theta)}{(1 - \beta\cos\theta)^2} = \frac{1}{k^2} \tag{6.73}$$


The integral over the first term is also easy to perform. We have to use the fact
that:

$$\frac{2p_f\cdot p_i}{(k\cdot p_f)(k\cdot p_i)} = \frac{2E^2(1 - \beta^2\cos\theta)}{kE(1 - \beta\cos\theta_f)\,kE(1 - \beta\cos\theta_i)} \tag{6.74}$$

We will now introduce the Feynman parameter trick, which is often used to
evaluate Feynman integrals:

$$\frac{1}{ab} = \int_0^1\frac{dx}{[ax + b(1 - x)]^2} \tag{6.75}$$

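By introducing a new variable $x$, we are able to perform the angular integration.
Writing $\beta_x = \beta\,|x\hat{\mathbf{p}}_f + (1 - x)\hat{\mathbf{p}}_i|$ for the magnitude of the combined velocity
and measuring the angle $\theta'$ from its direction, we find:

$$\begin{aligned}
\int\frac{d\Omega}{4\pi}\,\frac{2p_f\cdot p_i}{(k\cdot p_f)(k\cdot p_i)} &= \frac{2(1 - \beta^2\cos\theta)}{k^2}\int_0^1dx\int\frac{d(\cos\theta')}{2}\,\frac{1}{(1 - \beta_x\cos\theta')^2} \\
&= \frac{2(1 - \beta^2\cos\theta)}{k^2}\int_0^1\frac{dx}{1 - \beta^2 + 4\beta^2\sin^2(\theta/2)\,x(1 - x)} \\
&\approx \frac{1}{k^2}\begin{cases}2\left[1 + \frac{4}{3}\beta^2\sin^2(\theta/2)\right] + O(\beta^4) & \text{if } \beta \ll 1 \\ 2\log(-q^2/m^2) + O(m^2/q^2) & \text{if } m^2/q^2 \ll 1\end{cases}
\end{aligned} \tag{6.76}$$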
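Both limits quoted in Eq. (6.76) can be checked against the exact $x$ integral, which has an elementary closed form. The sketch below evaluates the factor multiplying $1/k^2$; it assumes $\beta > 0$ and $\theta \neq 0$, and uses $-q^2/m^2 = 4\beta^2\gamma^2\sin^2(\theta/2)$ for elastic scattering. The function names are illustrative.

```python
import math


def angular_factor(beta, theta):
    """2(1 - beta^2 cos(theta)) * Int_0^1 dx / (A + B x(1-x)), in closed form.

    A = 1 - beta^2 and B = 4 beta^2 sin^2(theta/2), as in Eq. (6.76).
    """
    a_coef = 1.0 - beta ** 2
    b_coef = 4.0 * beta ** 2 * math.sin(theta / 2) ** 2
    root_sum = math.sqrt(b_coef + 4.0 * a_coef)
    root_b = math.sqrt(b_coef)
    integral = (2.0 / (root_b * root_sum)
                * math.log((root_sum + root_b) / (root_sum - root_b)))
    return 2.0 * (1.0 - beta ** 2 * math.cos(theta)) * integral


def angular_factor_numeric(beta, theta, n=20000):
    """Same quantity from a midpoint-rule integration over x (cross-check)."""
    a_coef = 1.0 - beta ** 2
    b_coef = 4.0 * beta ** 2 * math.sin(theta / 2) ** 2
    h = 1.0 / n
    total = sum(h / (a_coef + b_coef * x * (1.0 - x))
                for x in ((i + 0.5) * h for i in range(n)))
    return 2.0 * (1.0 - beta ** 2 * math.cos(theta)) * total


slow = angular_factor(0.01, 2.0)
slow_limit = 2.0 * (1.0 + 4.0 / 3.0 * 0.01 ** 2 * math.sin(1.0) ** 2)

beta_fast = math.sqrt(1.0 - 1e-8)
fast = angular_factor(beta_fast, 1.0)
minus_q2_over_m2 = 4.0 * beta_fast ** 2 * math.sin(0.5) ** 2 / (1.0 - beta_fast ** 2)
fast_limit = 2.0 * math.log(minus_q2_over_m2)
```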


Inserting this value back into the previous expression, we find that the final soft
bremsstrahlung cross section is then given by:

$$\frac{d\sigma}{d\Omega} \approx \left(\frac{d\sigma}{d\Omega}\right)_0\frac{2\alpha}{\pi}\log\frac{E}{\mu}\times\begin{cases}\frac{4}{3}\beta^2\sin^2(\theta/2) + O(\beta^4) & \text{if } \beta \ll 1 \\ \log(-q^2/m^2) - 1 + O(m^2/q^2) & \text{if } m^2/q^2 \ll 1\end{cases} \tag{6.77}$$

Although this formula agrees well with experiment at large photon momenta, this
amplitude is clearly divergent if we let the fictitious mass of the photon $\mu$ go to
zero. Thus, the infrared divergence occurs because we have massless photons
present in the theory.

We should mention that the infrared problem arose (in another form) in classi- 
cal physics, before the advent of quantum mechanics. The essential point is that, 
even at the classical level, we have the effects due to the long-range Coulomb 
field. If one were to calculate the radiation field created by a particle being ac- 
celerated by a stationary charge, one would find a similar divergence using only 
classical equations. If one tries to divide the energy by $k_0$ to calculate the number
of photons emitted by bremsstrahlung, it turns out to be proportional to the result 
presented above. Thus, as the momentum of the emitted photon goes to zero, the 
number of emitted photons becomes infinite. (Classically, the infrared divergence 
appears in the number of emitted photons, not the emitted energy.) 

Quantum field theory gives us a novel, but rigorous, solution to the infrared
problem, which goes to the heart of the measurement process and the quantum
theory. To this order of approximation, we have to add the contribution of two
different physical processes to find the cross section of electrons scattering off
protons or other charged particles (Fig. 6.10).

Figure 6.10. The infrared divergence cancels if we add the contributions of two different
physical processes. These diagrams can be added together because the resolution of any
detector is not sensitive enough to select out just one process.

The first diagram describes the bremsstrahlung amplitude for the emission of 
an electron and a photon. The divergence of this amplitude is classical. Second, 
we have to sum over a purely quantum-mechanical effect, the radiative one-loop 
corrections of the electron elastically colliding off the charged proton. This may 
seem strange, because we are adding the cross sections of two different physical 
processes together, one elastic and one inelastic, to cancel the infrared divergence. 
However, this makes perfect sense from the point of view of the measuring process. 

The essential point is to observe that our detectors cannot differentiate the pres- 
ence of pure electrons from the presence of electrons accompanied by sufficiently 
soft photons. This is not just a problem of having crude measuring devices. No 
matter how precise our measuring apparatus may become, it can never be perfect; 
there will always be photons with momenta sufficiently close to zero that will 
sneak past them. Therefore, from an experimental point of view, our measuring 
apparatus cannot distinguish between these two types of processes and we must 
necessarily add these two diagrams together. Fortunately, we get an exact cancel- 
lation of the infrared divergences when these two scattering amplitudes are added 
together. 

A full discussion of this cancellation, however, cannot be given until we
discuss one-loop corrections to scattering amplitudes. Therefore, in Section 6.7,
we will prove that the bremsstrahlung amplitude, given by:

$$\frac{d\sigma}{d\Omega} \approx \left(\frac{d\sigma}{d\Omega}\right)_0\frac{\alpha}{\pi}\log\frac{E^2}{\mu^2}\,\log\frac{-q^2}{m^2} \tag{6.78}$$

must be added to the one-loop vertex correction in order to yield a convergent
integral. We will return to this problem at that time.

Finally, we note that, by the substitution rule, we can show the relationship
between bremsstrahlung and pair creation. Once again, by rotating the diagram
around, we can convert bremsstrahlung into pair creation.

This ends our discussion of the tree-level, lowest-order scattering matrix. 
Although we have had great success in reproducing and extending known classical 
results, there are immense difficulties involved in extending quantum field theory 
beyond the tree level. When loop corrections are calculated, we find that the 
integrals diverge in the ultraviolet region of momentum space. In fact, it has 
taken over a half century, involving the combined efforts of several generations 
of physicists, to resolve many of the difficulties of renormalization theory. 

We now turn to a detailed calculation of single-loop radiative corrections. 
Although the calculations are often long and tedious, involving formally divergent 
quantities, the final conclusions are simple and show that the various infinities can 
be consistently absorbed into a redefinition of the physical constants of the theory, 
such as the electric charge and electron mass. Most important, the agreement with 
experiment is astonishing. 

We will begin our discussion of radiative corrections by first examining the 
self-energy correction to the photon propagator, called the vacuum polarization 
graph. We will show that the divergence of this graph can be absorbed into a 
renormalization of the electric charge. 

Then, we will calculate the single-loop correction to the electron—photon vertex 
and show that this leads to corrections to the magnetic moment of the electron. The 
theoretical value of the anomalous magnetic moment will agree with experiment 
to one part in 10°. After that, we will show that the radiative correction to the 
vertex function is also infrared divergent. Fortunately, the sign of this infrared 
divergence is opposite the sign found in the bremsstrahlung amplitude. When 
added together, we will find that the two cancel exactly, giving us a quantum 
mechanical resolution of the infrared problem. 

And finally, we will close this chapter by analyzing the Lamb shift between 
the energy levels of the $2S_{1/2}$ and $2P_{1/2}$ orbitals of the hydrogen atom. The
calculation is rather intricate, because the hydrogen atom is a bound state, and also 
there are various contributions coming from the vertex correction, the anomalous 
magnetic moment, the self-energy of the electron, the vacuum polarization graph, 
etc. However, when all these contributions are added, we find agreement with 
experiment to within one part in 10°. 


6.6 Radiative Corrections 


Figure 6.11. First-loop correction to the photon propagator, the vacuum polarization graph,
which gives us a correction to the coupling constant and contributes to the Lamb shift.

The simplest higher-order radiative correction is the vacuum polarization graph,
shown in Figure 6.11. This graph is clearly divergent. For large momenta,
the Feynman propagators of the two electrons give us two powers of $p$ in the
denominator, while the overall integration over $d^4p$ gives us four powers of $p$ in
the numerator. So this graph diverges quadratically in the ultraviolet region of
momentum space:


I =-etr [ £% a (6.79) 
cial Ory” Pamerie” (24 =m tie 


We will perform this integration via the Pauli–Villars method,⁹ although the
dimensional regularization method, which we will present in the next chapter, is 
significantly simpler. The Pauli—Villars method replaces this divergent integral 
with a convergent one by assuming that there are fictitious fermions with mass 
M in the theory with ghost couplings. At the end of the calculation, these 
fictitious particles will decouple if we take the limit as their masses tend to 
infinity. Therefore, M gives us a convenient way of cutting off the divergences of 
the self-energy correction. 

The graph then becomes modified as follows: 


$$\Pi_{\mu\nu} \to \Pi_{\mu\nu,m} - \Pi_{\mu\nu,M} \tag{6.80}$$


The most convenient way in which to perform the integration is to add additional 
auxiliary variables. This allows us to reverse the order of integration. We can 
then perform the integration over the momenta, and save the integration over the 
auxiliary variables to the very end. We will use: 


$$\frac{i}{k^2 - m^2 + i\epsilon} = \int_0^\infty d\alpha\,e^{i\alpha(k^2 - m^2 + i\epsilon)} \tag{6.81}$$

Inserting this expression for the electron propagators and performing the trace, we 


find: 
co ee) d*k 
of 2 pesos 
IIpvm = 4c | aa | dan | 


$$\times\exp\left\{i\alpha_1(k^2 - m^2 + i\epsilon) + i\alpha_2[(k - q)^2 - m^2 + i\epsilon]\right\}$$
$$\times\left[k_\mu(k - q)_\nu + k_\nu(k - q)_\mu - g_{\mu\nu}(k^2 - k\cdot q - m^2)\right] \tag{6.82}$$


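With the insertion of these auxiliary variables, we can now perform the integration
over $d^4k$. First, we shift momenta and complete the square:

$$p = k - \frac{q\,\alpha_2}{\alpha_1 + \alpha_2} \tag{6.83}$$

Then we use the fact that:

$$\begin{aligned}
\int\frac{d^4p}{(2\pi)^4}\,e^{ip^2(\alpha_1 + \alpha_2)} &= \frac{1}{16\pi^2i(\alpha_1 + \alpha_2)^2} \\
\int\frac{d^4p}{(2\pi)^4}\,p_\mu p_\nu\,e^{ip^2(\alpha_1 + \alpha_2)} &= \frac{g_{\mu\nu}}{32\pi^2(\alpha_1 + \alpha_2)^3}
\end{aligned} \tag{6.84}$$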


Putting everything back into $\Pi_{\mu\nu}$, we have:

$$\Pi_{\mu\nu,m} = (g_{\mu\nu}q^2 - q_\mu q_\nu)\,\Pi_1 + g_{\mu\nu}\,\Pi_2 \tag{6.85}$$
where: 
= 12 dor dorr ef (o1,02) 
se =f iP (a +0)4 — 
a i fe , Rend ena ua yeh (6186) 
"(ou +02)3 
and where:

$$f(\alpha_1, \alpha_2) = i\left[\frac{\alpha_1\alpha_2\,q^2}{\alpha_1 + \alpha_2} - (m^2 - i\epsilon)(\alpha_1 + \alpha_2)\right] \tag{6.87}$$

There is a similar expression for $\Pi_{\mu\nu,M}$. We notice that $\Pi_2$ diverges quadratically,
which is bad. However, since we have carefully regularized this integral using
the Pauli–Villars method, the integral is finite for fixed $M$, and we are free to
manipulate this expression. We can then show that $\Pi_2$ vanishes. This rather
remarkable fact can be proved using simple scaling arguments. Consider integrals
of the following type:

$$\int_0^\infty\frac{dx}{x}\,f(x)\,e^{f(x)} = \left.\frac{\partial}{\partial\rho}\int_0^\infty\frac{dx}{x}\,e^{\rho f(x)}\right|_{\rho=1} \tag{6.88}$$

Assume that $f(x/\rho) = f(x)/\rho$. After this simple rescaling, this integral equals:

$$\left.\frac{\partial}{\partial\rho}\int_0^\infty\frac{dx}{x}\,e^{f(x)}\right|_{\rho=1} = 0 \tag{6.89}$$

since all dependence on $\rho$ has vanished. With a few modifications, this argument
can be used to show that $\Pi_2 = 0$. Thus, the vacuum polarization graph is only
logarithmically divergent.

To perform the integrations in $\Pi_1$, we use one last identity:

$$1 = \int_0^\infty\frac{d\rho}{\rho}\,\delta\left(1 - \frac{\alpha_1 + \alpha_2}{\rho}\right) \tag{6.90}$$

Inserting this into the expression for $\Pi_1$ and rescaling $\alpha_i$, we find:
i = =f i. aa day cae ee i dp 
(@; +02)" Jo — 
x8 (1 eagl 22 ) ef (a1,02) 
p 
_2i a 
= af . da, day aa 5(1 — a, — @2) 


od 
x | - exp [ip(q’aa2 — m* + ie)| (6.91) 
0 


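As expected, this integral is logarithmically divergent. At this point, we now use
the Pauli–Villars regulator, which lowers the divergence of the theory. To perform
the tricky $\rho$ integration, we use the fact that $m^2 - \alpha_1\alpha_2q^2$ is positive, so we can
rotate the contour integral of $\rho$ in the complex plane by −90 degrees. Using
integration by parts and rescaling, we have the following identity:

$$\int_\epsilon^\infty\frac{d\rho}{\rho}\,e^{-a\rho} = \left.\log\rho\;e^{-a\rho}\right|_\epsilon^\infty - \int_{\epsilon a}^\infty\log a\;e^{-\rho}\,d\rho + \int_{\epsilon a}^\infty\log\rho\;e^{-\rho}\,d\rho \tag{6.92}$$

where $a(m) = m^2 - \alpha_1\alpha_2q^2$. The dangerous divergence comes from the boundary
term, which behaves as $-\log\epsilon$. However, since this term and the last term are
independent of $a$ in the limit $\epsilon \to 0$, they cancel against the same terms, with a
minus sign, coming from the Pauli–Villars contribution. Thus, we have the
identity:

$$\lim_{\epsilon\to 0}\int_\epsilon^\infty\frac{d\rho}{\rho}\left(e^{-a(m)\rho} - e^{-a(M)\rho}\right) = -\log a(m) + \log a(M) = -\log\left(1 - \frac{\alpha_1\alpha_2\,q^2}{m^2}\right) + \log\frac{M^2}{m^2} \tag{6.93}$$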


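for large but finite $M$. We can now take the limit as $\epsilon \to 0$ and $M$ becomes large.
Then:

$$\Pi_1 = \frac{\alpha}{3\pi}\log\frac{M^2}{m^2} - \frac{2\alpha}{\pi}\int_0^1d\alpha_1\,\alpha_1(1 - \alpha_1)\log\left[1 - \alpha_1(1 - \alpha_1)\frac{q^2}{m^2}\right] \tag{6.94}$$

If we perform the last and final integration over $\alpha_1$, we arrive at:

$$\Pi_1 = \frac{\alpha}{3\pi}\left\{\log\frac{M^2}{m^2} - \frac{1}{3} - 2\left(1 + \frac{2m^2}{q^2}\right)\left[\left(\frac{4m^2}{q^2} - 1\right)^{1/2}{\rm arccot}\left(\frac{4m^2}{q^2} - 1\right)^{1/2} - 1\right]\right\} \tag{6.95}$$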
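The final parameter integral can be checked numerically. The sketch below compares the $q^2$-dependent part of Eq. (6.94) with the closed form in Eq. (6.95), in units of $\alpha/\pi$ and for $0 < q^2 < 4m^2$, where the arccotangent is real. The Simpson rule and function names are illustrative, and ${\rm arccot}(s)$ is evaluated as $\arctan(1/s)$.

```python
import math


def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0


def pi1_finite_numeric(q2, m=1.0):
    """-2 * Int_0^1 da a(1-a) log[1 - a(1-a) q^2/m^2], units alpha/pi (Eq. 6.94)."""
    def integrand(a):
        w = a * (1.0 - a)
        return w * math.log(1.0 - w * q2 / m ** 2)
    return -2.0 * simpson(integrand, 0.0, 1.0)


def pi1_finite_closed(q2, m=1.0):
    """q^2-dependent part of Eq. (6.95), units alpha/pi, valid for 0 < q^2 < 4m^2."""
    s = math.sqrt(4.0 * m ** 2 / q2 - 1.0)
    bracket = s * math.atan(1.0 / s) - 1.0
    return -(1.0 / 3.0) * (1.0 / 3.0 + 2.0 * (1.0 + 2.0 * m ** 2 / q2) * bracket)


numeric = pi1_finite_numeric(2.0)
closed = pi1_finite_closed(2.0)
```

For small $q^2$ both expressions reduce to $q^2/(15m^2)$ in these units, which is the term responsible for the vacuum polarization contribution to the Lamb shift.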


This is our final result. After a long calculation, we find a surprisingly simple
result that has a physical interpretation. We claim that the logarithmic divergence
can be cancelled against another logarithmic divergence coming from the bare
electric charge $e_0$. In fact, we will simply define the divergence of the electric
charge so that it precisely cancels against the logarithmic divergence of $\Pi_1$.

To lowest order, we find that we can add the usual photon propagator $D_{\mu\nu}$ to
the one-loop correction, leaving us with a revised propagator:

$$D'_{\mu\nu} \sim -\frac{g_{\mu\nu}}{q^2}\left(1 - \frac{\alpha}{3\pi}\log\frac{M^2}{m^2}\right) \tag{6.96}$$

in the limit as $q^2 \to 0$. This leaves us with the usual theory, except that the photon
propagator is multiplied by a factor:

$$D'_{\mu\nu} = -Z_3\,\frac{g_{\mu\nu}}{q^2} \tag{6.97}$$

where:

$$Z_3 = 1 - \frac{\alpha}{3\pi}\log\frac{M^2}{m^2} \tag{6.98}$$

Now let us absorb this divergence into the coupling constant $e_0$. We are
then left with the usual theory with an extra (infinite) factor $Z_3$ multiplying each
propagator. Since the photon propagator is connected to two electron vertices,
with coupling $e_0$, we can absorb $Z_3$ into the coupling constant, so we have, to
lowest order:

$$e = \sqrt{Z_3}\,e_0 \tag{6.99}$$

where e is called the renormalized electric charge. Since the infinity coming 
from Z3 cancels (by construction) against the infinity coming from the bare elec- 
tric charge, the renormalized electric charge e is finite. (Other renormalization 
constants will be discussed in Chapter 7.) 
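
To get a feeling for the size of this renormalization, the sketch below evaluates the one-loop factor for a few values of the Pauli-Villars mass M. It assumes the form $Z_3 = 1 - (\alpha/3\pi)\log(M^2/m^2)$ read off from the factor in Eq. (6.96); the numerical values of $\alpha$, the electron mass, and the Planck-scale cutoff are standard inputs supplied here for illustration, and the function name `z3_one_loop` is hypothetical.

```python
import math

ALPHA = 1 / 137.035999      # fine-structure constant (low-energy value)
M_ELECTRON_GEV = 0.000511   # electron mass in GeV

def z3_one_loop(cutoff_gev):
    """Z_3 = 1 - (alpha / 3 pi) log(M^2 / m^2) for a Pauli-Villars mass M."""
    log_term = math.log(cutoff_gev**2 / M_ELECTRON_GEV**2)
    return 1.0 - ALPHA / (3.0 * math.pi) * log_term

for cutoff in (1.0, 1.0e3, 1.22e19):   # 1 GeV, 1 TeV, roughly the Planck mass
    z3 = z3_one_loop(cutoff)
    print(f"M = {cutoff:.3g} GeV: Z3 = {z3:.4f}, e/e0 = {math.sqrt(z3):.4f}")
```

Because the divergence is only logarithmic, the one-loop factor stays within roughly ten percent of unity even for a cutoff near the Planck mass; it becomes infinite only in the strict limit of infinite M.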

Although this works at the lowest level, it remains to be seen whether we can 
extend this procedure to all orders in perturbation theory. This will be further 
discussed in the next chapter. Next, we turn to the calculation of the single-loop 
correction to the vertex function, which gives us the anomalous magnetic moment 
of the electron. 


6.7 Anomalous Magnetic Moment 


We recall that in Chapter 4 we derived the magnetic moment of the electron 
by analyzing the coupling of the electron to the vector potential. At the tree 
level, we know that the coupling of an electron to the photon is given by $A^\mu \bar u\gamma_\mu u$, 
which in turn gives us a gyromagnetic ratio of g = 2. However, the experimentally 
observed value differed from this predicted value by a small but important amount. 
Schwinger’s original calculation¹⁰ of the anomalous magnetic moment of the 
electron helped to establish QED as the correct theory of electrons and photons. 

To calculate the higher-order corrections to the magnetic moment of the elec- 
tron, we will use the Gordon identity: 


$$\bar u(p')\gamma_\mu u(p) = \frac{1}{2m}\,\bar u(p')\left[(p+p')_\mu + i\sigma_{\mu\nu}q^\nu\right]u(p) \tag{6.100}$$

(To prove this, we simply use the Dirac equation repeatedly on the left and right 
spinors, which are on-shell.) The magnetic moment of the electron comes from 
the second term $\bar u\sigma_{\mu\nu}q^\nu u A^\mu$. To see this, we take the Fourier transform, so 
$q^\nu$ becomes $\partial^\nu$, and the coupling becomes $\bar u\sigma_{\mu\nu}F^{\mu\nu}u$. The magnetic field $B_i$ 
is proportional to $\epsilon_{ijk}F^{jk}$, so this coupling term in the rest frame now becomes 
$\bar u\sigma_i B_i u$, where we use the Dirac representation of the Dirac matrices. Since $\sigma_i$ 
is proportional to the spin of the electron, which in turn is proportional to the 
magnetic moment of the electron, the coupling becomes $\boldsymbol{\mu}\cdot\mathbf{B}$. This is the energy 
of a magnet with moment $\boldsymbol{\mu}$ in a magnetic field $\mathbf{B}$. 

In this section, we will calculate the one-loop vertex correction, which gives 
us a correction to the electron-photon coupling given by (to lowest order in $\alpha$): 

$$\bar u(p')\left[\frac{(p+p')_\mu}{2m} + \left(1 + \frac{\alpha}{2\pi}\right)\frac{i\sigma_{\mu\nu}q^\nu}{2m}\right]u(p) \tag{6.101}$$

for the process given in Figure 6.12. 
Notice that the $\bar u\sigma_{\mu\nu}q^\nu u$ term is modified by the one-loop correction, so that 
the g of the electron becomes: 

$$\frac{g}{2} = 1 + \frac{\alpha}{2\pi} \tag{6.102}$$

Figure 6.12. First-loop correction to the electron vertex function, which contributes to the 
anomalous electron magnetic moment. 

Thus, QED predicts a correction to the moment of the electron. To show this, we 
will begin our calculation with the one-loop vertex correction: 


$$\Lambda_\mu(p',p) = (-ie)^2\int\frac{d^4k}{(2\pi)^4}\,\frac{-i}{k^2-\mu^2+i\epsilon}\,\gamma_\nu\,\frac{i}{\not{p}' - \not{k} - m + i\epsilon}\,\gamma_\mu\,\frac{i}{\not{p} - \not{k} - m + i\epsilon}\,\gamma^\nu \tag{6.103}$$

Anticipating that the integral is infrared divergent, we have added $\mu$, the fictitious 
mass of the photon. The integral is also divergent in the ultraviolet region, so we 
will use the Pauli-Villars cutoff method later to isolate the divergence. 
Throughout this calculation, we tacitly assume that we have sandwiched this 
vertex between two on-shell spinors, so we can use the Gordon decomposition 
and the mass-shell condition. Our goal is to write this expression in the form: 


$$\Lambda_\mu \sim \gamma_\mu F_1(q^2) + \frac{i\sigma_{\mu\nu}q^\nu}{2m}\,F_2(q^2) \tag{6.104}$$

sandwiched between $\bar u(p')$ and $u(p)$, where $F_1$ and $F_2$ are the form factors that 
measure the deviation from the simple $\gamma_\mu$ vertex. We will calculate explicit forms 
for these two form factors. (We will find that $F_1$ cancels against the infrared 
divergence found in the bremsstrahlung calculation, giving us a finite result. We 
will show that $F_2$ gives us a correction to the magnetic moment of the electron.) 

We begin with the Feynman parameter trick, generalizing Eq. (6.75). The 
following can be proved by induction: 


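$$\frac{1}{a_1 a_2\cdots a_n} = (n-1)!\int_0^\infty dz_1\,dz_2\cdots dz_n\,\frac{\delta\left(1-\sum_i z_i\right)}{A^n} \tag{6.105}$$

where $A = \sum_i a_i z_i$. For our purposes, we want: 

$$a_1 = k^2 - \mu^2 + i\epsilon$$
$$a_2 = (p'-k)^2 - m^2 + i\epsilon$$
$$a_3 = (p-k)^2 - m^2 + i\epsilon \tag{6.106}$$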
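
The parameter formula is easy to test numerically before it is used. The sketch below checks the n = 2 and n = 3 cases for positive real $a_i$ (an assumption made only so that the integrands are free of singularities); the delta function is solved by writing $z_1 = u$, $z_2 = (1-u)v$, $z_3 = (1-u)(1-v)$, which has Jacobian $(1-u)$. The function names are illustrative.

```python
def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def feynman_two(a1, a2):
    """1! * int_0^1 dx / (a1 x + a2 (1 - x))^2, which should equal 1/(a1 a2)."""
    return simpson(lambda x: 1.0 / (a1 * x + a2 * (1.0 - x)) ** 2, 0.0, 1.0)

def feynman_three(a1, a2, a3):
    """2! * integral over z1 + z2 + z3 = 1 of 1/A^3, which should equal 1/(a1 a2 a3)."""
    def inner(u):
        # z1 = u, z2 = (1-u) v, z3 = (1-u)(1-v); the factor (1-u) is the Jacobian
        return simpson(
            lambda v: (1.0 - u) / (a1 * u + (1.0 - u) * (a2 * v + a3 * (1.0 - v))) ** 3,
            0.0, 1.0)
    return 2.0 * simpson(inner, 0.0, 1.0)

print(feynman_two(2.0, 5.0), 1.0 / (2.0 * 5.0))
print(feynman_three(1.0, 2.0, 3.0), 1.0 / (1.0 * 2.0 * 3.0))
```

In the vertex calculation the $a_i$ are the three propagator denominators above, and the $i\epsilon$ terms play the role that positivity plays in this toy check.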
Therefore: 

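$$\Lambda_\mu(p',p) = -2ie^2\int\frac{d^4k}{(2\pi)^4}\int_0^\infty dz_1\,dz_2\,dz_3\,\delta(1-z_1-z_2-z_3)\,\frac{N_\mu(k)}{A^3} \tag{6.107}$$

where: 

$$N_\mu(k) = \gamma_\nu(\not{p}' - \not{k} + m)\gamma_\mu(\not{p} - \not{k} + m)\gamma^\nu \tag{6.108}$$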


Next, we would like to perform the integration by completing the square: 


$$A = \sum_{i=1}^{3} a_i z_i = (k - p'z_2 - pz_3)^2 - \Delta_0$$

$$\Delta_0 = m^2(1-z_1)^2 + \mu^2 z_1 - q^2 z_2 z_3 - i\epsilon - (p'^2-m^2)z_2(1-z_2) - (p^2-m^2)z_3(1-z_3) \tag{6.109}$$

This allows us to make the shift in integration: 

$$k \to k + p'z_2 + pz_3 \tag{6.110}$$


Therefore, after the shift, we have: 


$$\Lambda_\mu(p',p) = -\frac{2ie^2}{(2\pi)^4}\int_0^\infty dz_1\,dz_2\,dz_3\,\delta(1-z_1-z_2-z_3)\int d^4k\,\frac{N_\mu(k+p'z_2+pz_3)}{(k^2-\Delta_0)^3} \tag{6.111}$$

By power counting, the integral diverges. This is why we must subtract off the 
contribution of the Pauli-Villars field, which has mass $\Lambda$. Let us expand: 

$$N_\mu(k+p'z_2+pz_3) = 2k^2\gamma_\mu - 4k_\mu\not{k} + A_\mu(k) + N_\mu(p'z_2+pz_3) \tag{6.112}$$

where the quadratic terms follow from $\gamma_\nu\not{k}\gamma_\mu\not{k}\gamma^\nu = -2\not{k}\gamma_\mu\not{k}$, and where $A_\mu(k)$ 
is linear in the $k_\mu$ variable. (This term can be dropped, since its 
integral over $d^4k$ vanishes.) 
Therefore, the leading divergence behaves like: 


$$\int d^4k\left[\frac{k_\mu k_\nu}{(k^2-\Delta_0)^3} - \frac{k_\mu k_\nu}{(k^2-\Delta_\Lambda)^3}\right] \tag{6.113}$$

where $\Delta_\Lambda$ represents the Pauli-Villars contribution, where $\mu$ is replaced by $\Lambda$. 
With this insertion, the integral converges for finite but large $\Lambda$. To perform this 
integration, we must do an analytic continuation of the previous equation. We 
know that (see Appendix): 


$$\int d^4k\,\frac{k_\mu k_\nu}{(k^2-\Delta)^\alpha} = \frac{i\pi^2\,\Gamma(\alpha-3)}{2\Gamma(\alpha)(-\Delta)^{\alpha-3}}\,g_{\mu\nu} \tag{6.114}$$


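Now let us take the limit as $\alpha \to 3$. This expression, of course, diverges, but 
the Pauli-Villars term subtracts off the divergence. If we let $\alpha - 3 = \epsilon$, then we 
have: 

$$\lim_{\alpha\to 3}\int d^4k\left[\frac{k_\mu k_\nu}{(k^2-\Delta_0)^\alpha} - \frac{k_\mu k_\nu}{(k^2-\Delta_\Lambda)^\alpha}\right]$$

$$= \lim_{\epsilon\to 0}\frac{i\pi^2 g_{\mu\nu}\,\Gamma(\epsilon)}{2\Gamma(3)}\left[\frac{1}{(-\Delta_0)^\epsilon} - \frac{1}{(-\Delta_\Lambda)^\epsilon}\right]$$

$$= \lim_{\epsilon\to 0}\frac{i\pi^2 g_{\mu\nu}}{4\epsilon}\left(e^{-\epsilon\log(-\Delta_0)} - e^{-\epsilon\log(-\Delta_\Lambda)}\right)$$

$$= \frac{i\pi^2}{4}\,g_{\mu\nu}\log\left(\frac{z_1\Lambda^2}{\Delta_0}\right) \tag{6.115}$$

where we drop terms like $\Lambda^{-2}$. Because this expression is sandwiched between 
two on-shell spinors, we can also reduce the term: 

$$N_\mu(p'z_2+pz_3) = -\gamma_\mu\left[2m^2(1-4z_1+z_1^2) + 2q^2(1-z_2)(1-z_3)\right] - 2mz_1(1-z_1)\,i\sigma_{\mu\nu}q^\nu \tag{6.116}$$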


At this point, all integrals can be evaluated. Putting everything together, and 
dropping all terms of order $\Lambda^{-1}$ or less, we now have: 

$$\Lambda_\mu(p',p) = \frac{\alpha}{2\pi}\int_0^\infty dz_1\,dz_2\,dz_3\,\delta(1-z_1-z_2-z_3)\left\{\gamma_\mu\left[\log\left(\frac{z_1\Lambda^2}{\Delta_0}\right) + \frac{m^2(1-4z_1+z_1^2)+q^2(1-z_2)(1-z_3)}{\Delta_0}\right] + \frac{i\sigma_{\mu\nu}q^\nu}{2m}\,\frac{2m^2z_1(1-z_1)}{\Delta_0}\right\} \tag{6.117}$$

Now let us compare this expression with the form factors $F_1$ and $F_2$ appearing 
in Eq. (6.104). It is easy to read off: 

$$F_1(q^2) = 1 + \frac{\alpha}{2\pi}\int_0^\infty dz_1\,dz_2\,dz_3\,\delta(1-z_1-z_2-z_3)\left\{-\log\frac{\Delta_0}{m^2} + \frac{m^2(1-4z_1+z_1^2)+q^2(1-z_2)(1-z_3)}{\Delta_0} - (q^2=0)\right\} \tag{6.118}$$

[We have deliberately subtracted off the integrand defined at $q^2 = 0$ in order to 
maintain the constraint $F_1(q^2 = 0) = 1$, which preserves the correct normalization 
of the vertex function. This extra subtraction term comes from the fact that we 
have separated out the divergent logarithmic part, which is absorbed in an infinite 
constant called $Z_1$, which renormalizes the vertex function. This point will be 
discussed in more detail in Chapter 7.] 
The value of $F_2$ can similarly be read off: 


$$F_2(q^2) = \frac{\alpha}{2\pi}\int_0^\infty dz_1\,dz_2\,dz_3\,\delta(1-z_1-z_2-z_3)\left(\frac{2m^2z_1(1-z_1)}{\Delta_0}\right) \tag{6.119}$$


The calculation for $F_2(q^2)$ is a bit easier, since there are no ultraviolet or 
infrared divergences. Because of this, we can set $\mu = 0$. Let us choose new 
variables: 

$$p\cdot p' = m^2\cosh\theta; \qquad q^2 = -4m^2\sinh^2(\theta/2) = 2m^2(1-\cosh\theta) \tag{6.120}$$

where $q^2 = (p - p')^2$. Then we have: 


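$$F_2(q^2) = \frac{\alpha}{\pi}\int_0^1 dz_2\,dz_3\,\theta(1-z_2-z_3)\,\frac{z_2+z_3-(z_2+z_3)^2}{z_2^2+z_3^2+2z_2z_3\cosh\theta}$$

$$= \frac{\alpha}{2\pi}\int_0^1 d\beta\,\frac{1}{\beta^2+(1-\beta)^2+2\beta(1-\beta)\cosh\theta} \tag{6.121}$$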


This leaves us with the exact result: 


$$F_2(q^2) = \frac{\alpha}{2\pi}\,\frac{\theta}{\sinh\theta} \tag{6.122}$$


We are especially interested in taking the limit as $|q^2| \to 0$ and $\theta \to 0$: 

$$F_2(0) = \frac{\alpha}{2\pi} \tag{6.123}$$

This gives us the correction to the magnetic moment of the electron, as in Eq. 
(6.101): 

$$\frac{g-2}{2} = \frac{\alpha}{2\pi} \approx 0.0011614 \tag{6.124}$$


This is only a first-order calculation, yet already we are very close to the ex- 
perimental value. Since the calculation was originally performed by Schwinger, 
the calculation since has been taken to $\alpha^3$ order (where there are 72 Feynman 
diagrams). The theoretical value to this order is given by: 

$$a_{\rm th} = \frac{1}{2}(g-2) = 0.5\left(\frac{\alpha}{\pi}\right) - 0.32848\left(\frac{\alpha}{\pi}\right)^2 + 1.49\left(\frac{\alpha}{\pi}\right)^3 + \cdots \tag{6.125}$$

The final results for both the theoretical and experimental values are¹¹: 

$$a_{\rm th} = 0.001159652411(166)$$
$$a_{\rm expt} = 0.001159652209(31) \tag{6.126}$$


where the estimated errors are in parentheses. 

The calculation agrees to within one part in 10° for a and to one part in 10? 
for g, which is graphic vindication of QED. (To push the calculation to the fourth 
order involves calculating 891 diagrams and 12,672 diagrams at the fifth order.) 
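
The two results of this section lend themselves to a quick numerical check. The sketch below, which uses only quantities quoted above plus an assumed value $\alpha \approx 1/137.036$, first verifies that the $\beta$ integral of Eq. (6.121) reproduces $\theta/\sinh\theta$ as in Eq. (6.122), and then evaluates the truncated series of Eq. (6.125) order by order. The function names are illustrative.

```python
import math

ALPHA = 1 / 137.035999   # assumed low-energy value of the fine-structure constant

def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def f2_ratio(theta):
    """F_2(q^2) divided by alpha/(2 pi), from the beta integral in Eq. (6.121)."""
    c = math.cosh(theta)
    return simpson(lambda b: 1.0 / (b * b + (1 - b) ** 2 + 2 * b * (1 - b) * c), 0.0, 1.0)

def anomaly_series(alpha, order):
    """Partial sum of Eq. (6.125) through the given order in alpha/pi (1, 2, or 3)."""
    x = alpha / math.pi
    coefficients = (0.5, -0.32848, 1.49)
    return sum(c * x ** (k + 1) for k, c in enumerate(coefficients[:order]))

theta = 1.3
print(f2_ratio(theta), theta / math.sinh(theta))
for order in (1, 2, 3):
    print(order, anomaly_series(ALPHA, order))
```

The first-order entry is Schwinger's $\alpha/2\pi$; the higher-order entries move the prediction toward the measured value in Eq. (6.126), within the limits set by the rounded coefficients and the assumed $\alpha$.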


6.8 Infrared Divergence 


The calculation for $F_1$ is much more difficult. However, it will be very important 
in resolving the question of the infrared divergence, which we found in the earlier 
discussion of bremsstrahlung. We will find that the infrared divergences coming 
from the bremsstrahlung graph and $F_1$ cancel exactly. Although the calculation of 
$F_1$ is difficult, one can extract useful information from the integral by taking 
the limit as $\mu$ becomes small. Then the $F_1$ integration in Eq. (6.118) splits up into 
four pieces: 

$$F_1 = \sum_{i=1}^{4} P_i + \cdots \tag{6.127}$$


where the ellipsis represents constant terms. 
After changing variables, each of the $P_i$ pieces can be exactly evaluated: 

$$P_1 = -\frac{\alpha}{\pi}\int_0^1 dz_2\int_0^{1-z_2} dz_3\,\frac{\cosh\theta}{z_2^2+z_3^2+2z_2z_3\cosh\theta+(\mu^2/m^2)(1-z_2-z_3)}$$

$$= \frac{\alpha}{\pi}\,\theta\coth\theta\,\log\frac{\mu}{m} - \frac{2\alpha}{\pi}\coth\theta\int_0^{\theta/2} d\phi\,\phi\tanh\phi \tag{6.128}$$

where, with $\beta = z_2/(z_2+z_3)$: 

$$1 - 2\beta = \frac{\tanh\phi}{\tanh(\theta/2)} \tag{6.129}$$
The others are given by: 

$$P_2 = \frac{\alpha}{\pi}\cosh\theta\int_0^1 dz_2\int_0^{1-z_2} dz_3\,\frac{z_2+z_3}{z_2^2+z_3^2+2z_2z_3\cosh\theta}$$

Thus, adding the pieces together, we find: 


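$$F_1(q^2) = \frac{\alpha}{\pi}\left[\left(\log\frac{\mu}{m}+1\right)(\theta\coth\theta-1) - 2\coth\theta\int_0^{\theta/2} d\phi\,\phi\tanh\phi - \frac{\theta}{4}\tanh\frac{\theta}{2}\right] \tag{6.131}$$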


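For $|q^2| \ll m^2$, we find using Eqs. (6.104), (6.120), (6.122), and (6.131): 

$$\gamma_\mu + \Lambda^c_\mu(p',p) \sim \gamma_\mu\left[1 + \frac{\alpha q^2}{3\pi m^2}\left(\log\frac{m}{\mu} - \frac{3}{8}\right)\right] + \frac{\alpha}{2\pi}\,\frac{i\sigma_{\mu\nu}q^\nu}{2m} \tag{6.132}$$

For $|q^2| \gg m^2$, we find: 

$$\gamma_\mu + \Lambda^c_\mu(p',p) \sim \gamma_\mu\left\{1 - \frac{\alpha}{\pi}\log\frac{m}{\mu}\left[\log\frac{-q^2}{m^2} - 1 + O(m^2/q^2)\right]\right\} \tag{6.133}$$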
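
The small-$q^2$ limit can be checked against the full expression. The sketch below evaluates the bracket of Eq. (6.131) (that is, $F_1$ divided by $\alpha/\pi$) at small $\theta$ and compares it with $(q^2/3m^2)(\log(m/\mu) - 3/8)$, using $q^2/m^2 = -4\sinh^2(\theta/2)$ from Eq. (6.120). The value chosen for $\mu/m$ is arbitrary and the function names are illustrative.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def f1_bracket(theta, log_mu_over_m):
    """The square bracket of Eq. (6.131), i.e. F_1 in units of alpha/pi."""
    coth = 1.0 / math.tanh(theta)
    phi_integral = simpson(lambda phi: phi * math.tanh(phi), 0.0, theta / 2.0)
    return ((log_mu_over_m + 1.0) * (theta * coth - 1.0)
            - 2.0 * coth * phi_integral
            - 0.25 * theta * math.tanh(theta / 2.0))

def small_q_limit(theta, log_mu_over_m):
    """(q^2 / 3 m^2)(log(m/mu) - 3/8), with q^2/m^2 = -4 sinh^2(theta/2)."""
    q2_over_m2 = -4.0 * math.sinh(theta / 2.0) ** 2
    return q2_over_m2 / 3.0 * (-log_mu_over_m - 0.375)

theta, log_ratio = 0.05, math.log(1.0e-3)   # mu/m = 10^-3, chosen for illustration
print(f1_bracket(theta, log_ratio), small_q_limit(theta, log_ratio))
```

The two numbers should agree up to corrections of relative order $\theta^2$, which confirms the coefficient $-3/8$ accompanying the infrared logarithm.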


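Plugging all this into the cross-section formula, we now find our final result: 

$$\frac{d\sigma}{d\Omega} = \left(\frac{d\sigma}{d\Omega}\right)_0\left[1 - \frac{2\alpha}{\pi}\log\frac{m}{\mu}\;\chi(q^2)\right] \tag{6.134}$$

where: 

$$\chi(q^2) = \begin{cases} -\dfrac{q^2}{3m^2} & \text{if } -q^2 \ll m^2 \\[2mm] \log\dfrac{-q^2}{m^2} - 1 & \text{if } -q^2 \gg m^2 \end{cases} \tag{6.135}$$
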

Now we come to the final step, the comparison of the bremsstrahlung amplitude 
in Eq. (6.77) and the vertex correction for electron scattering. Although they 
represent different physical scattering processes, they must be added because there 
is always an uncertainty in our measuring equipment in measuring soft photons. 
Comparing the two amplitudes, we recall that the bremsstrahlung amplitude in 
Eq. (6.78) was given by: 


$$\frac{d\sigma}{d\Omega} \sim \left(\frac{d\sigma}{d\Omega}\right)_0\frac{\alpha}{\pi}\log\frac{-q^2}{m^2}\,\log\frac{E^2}{\mu^2} \tag{6.136}$$

while the vertex correction graph yields: 

$$\frac{d\sigma}{d\Omega} \sim \left(\frac{d\sigma}{d\Omega}\right)_0\left(1 - \frac{\alpha}{\pi}\log\frac{-q^2}{m^2}\,\log\frac{m^2}{\mu^2}\right) \tag{6.137}$$

Clearly, when these two amplitudes are added together, we find a finite, conver- 
gent result independent of $\mu$, as desired. The cancellation of infrared divergences 
to all orders in perturbation theory is a much more involved process. However, 
there are surprising simplifications that give a very simple result for this calculation 
(see Appendix). 


6.9 Lamb Shift 


Two of the great accomplishments of QED were the determination of the anoma- 
lous magnetic moment of the electron, which we discussed earlier, and the Lamb 
shift.¹² The fact that these two effects could not be explained by ordinary quantum 
mechanics, and the fact that the QED result was so accurate, helped to convince 
the skeptics that QED was the correct theory of the electron-photon system. 

Figure 6.13. The various higher-order graphs that contribute to the Lamb shift: (a) and 
(b) the electron self-energy graphs; (c) the vertex correction; (d) and (e) the electron mass 
counterterm; (f) the photon self-energy correction. 

In 1947, Lamb and Retherford demonstrated that the $2S_{1/2}$ and the $2P_{1/2}$ energy 
levels of the hydrogen atom were split; the $2P_{1/2}$ energy level was depressed more 
than 1000 MHz below the $2S_{1/2}$ energy level. (The original Dirac electron in a 
classical Coulomb potential, as we saw earlier in Section 4.2, predicted that the 
energy levels of the hydrogen atom should depend only on the principal quantum 
number n and the total angular momentum j, so these two levels should be degenerate.) 

The calculation of the Lamb shift is rather intricate, because we are dealing with 
the hydrogen atom as a bound-state problem, and also because we must sum over 
all radiative corrections to the electron interacting with a Coulomb potential that 
modify the naive $\bar u\gamma_0 u A^0$ vertex. These corrections include the vertex correction, 
the anomalous magnetic moment, the self-energy of the electron, the vacuum 
polarization graph, and even infrared divergences (Fig. 6.13). 

The original nonrelativistic bound-state calculation of Bethe,¹³ which ignored 
many of these subtle higher-order corrections, could account for about 1000 MHz 
of the Lamb shift, but only a fully relativistic quantum treatment could calculate 
the rest of the difference. Because of the intricate nature of the calculation, we 
will only sketch the highlights of the calculation. To begin the discussion, we 
first see that the vacuum polarization graph can be attached to the photon line, 
changing the photon propagator to: 


$$D'_{\mu\nu} = -\frac{g_{\mu\nu}}{q^2}\left(1 - \frac{e^2q^2}{60\pi^2m^2} + O(e^4)\right) \tag{6.138}$$

This, of course, translates into a shift in the effective coupling of an electron to the 
Coulomb potential.¹⁴,¹⁵ Analyzing the zeroth component of this propagator, we 
see that the coupling of the electron to the Coulomb potential changes as follows: 

$$ie^2\,\frac{\bar u\gamma_0 u}{q^2} \to ie^2\,\frac{\bar u\gamma_0 u}{q^2}\left(1 - \frac{\alpha q^2}{15\pi m^2} + O(\alpha^2)\right) \tag{6.139}$$

To convert this back into x space, let us take the Fourier transform. We know 
that the Fourier transform of $1/q^2$ is proportional to $1/r$. This means that the 
static Coulomb potential that the electron sees is given by: 

$$\frac{e^2}{4\pi r} \to \left(1 - \frac{\alpha}{15\pi m^2}\nabla^2\right)\frac{e^2}{4\pi r} = \frac{e^2}{4\pi r} + \frac{e^4}{60\pi^2m^2}\,\delta^3(x) \tag{6.140}$$


meaning that there is a correction to Coulomb’s law given by QED. This cor- 
rection, in turn, shifts the energy levels of the hydrogen atom. We know from 
ordinary nonrelativistic quantum mechanics that, by taking matrix elements of 
this modified potential between hydrogen wave functions, we can calculate the 
first-order correction to the energy levels of the hydrogen atom due to the vacuum 
polarization graph. 
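
The matrix element just described can be evaluated in closed form for S states, and the sketch below does so numerically. It rests on several assumptions that are not spelled out in the text: the potential energy of the electron in the proton's field carries the opposite sign to the potential written above, so the delta-function term enters as $-(4\alpha^2/15m^2)\,\delta^3(x)$ (using $e^2 = 4\pi\alpha$); the nonrelativistic hydrogen wave function at the origin is $|\psi_n(0)|^2 = (\alpha m)^3/\pi n^3$ for an infinitely heavy proton; and the constants are standard values supplied here. The function name `uehling_shift_mhz` is illustrative.

```python
import math

ALPHA = 7.2973525693e-3          # fine-structure constant
ELECTRON_MASS_EV = 510998.95     # electron rest energy m c^2 in eV
PLANCK_EV_S = 4.135667696e-15    # Planck constant h in eV s

def uehling_shift_mhz(n):
    """First-order shift of the nS level from the delta-function term, in MHz.

    Delta E = -(4 alpha^2 / 15 m^2) |psi_n(0)|^2 = -4 alpha^5 m / (15 pi n^3).
    """
    energy_ev = -4.0 * ALPHA**5 * ELECTRON_MASS_EV / (15.0 * math.pi * n**3)
    return energy_ev / PLANCK_EV_S / 1.0e6

for n in (1, 2):
    print(f"{n}S level: {uehling_shift_mhz(n):.2f} MHz")
```

States with nonzero orbital angular momentum vanish at the origin and are not shifted at this order, so the 2S value is the vacuum polarization contribution to the $2S_{1/2}$ - $2P_{1/2}$ splitting; it can be compared with the figure quoted for that graph later in this section.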

Now let us generalize this discussion to include the other corrections to the 
calculation of the Lamb shift. Our method is the same: calculate the corrections to 
the vertex function $\bar u\gamma_\mu u$, take the zeroth component, and then take the low-energy 
limit. In Eqs. (6.104), (6.132), and (6.133), we saw how radiative corrections 
modified the vertex function with additional form factors $F_1(q^2)$ and $F_2(q^2)$. If 
we add the various contributions to the vertex correction, we find: 

$$\bar u\gamma_\mu u \to \bar u\left[\gamma_\mu\left(1 + \frac{\alpha q^2}{3\pi m^2}\left(\log\frac{m}{\mu} - \frac{3}{8} - \frac{1}{5}\right)\right) + \frac{\alpha}{2\pi}\,\frac{i\sigma_{\mu\nu}q^\nu}{2m}\right]u \tag{6.141}$$

(For example, the vacuum polarization graph contributes the factor -1/5 to the 
vertex correction. The logarithm term comes from the vertex correction, and the $\mu$ 
term is eventually cancelled by the infrared correction.) Now take the low-energy 
limit of this expression. The $\sigma_{\mu\nu}q^\nu$ term reduces down to a spin-orbit correction, 
and we find the effective potential given by: 

$$\Delta V_{\rm eff} \sim \frac{4\alpha^2}{3m^2}\left(\log\frac{m}{\mu} - \frac{3}{8} - \frac{1}{5}\right)\delta^3(x) + \frac{\alpha^2}{4\pi m^2 r^3}\,\boldsymbol{\sigma}\cdot\mathbf{L} \tag{6.142}$$

By taking the matrix element of this potential between two hydrogen wave 
functions, we can calculate the energy split due to this modified potential. The 
vertex correction, for example, gives us a correction of 1010 MHz. The anomalous 
magnetic moment of the electron contributes 68 MHz. And the vacuum polariza- 
tion graph, calculated earlier, contributes -27.1 MHz. Adding these corrections 
together, we find, to the lowest loop level, that we arrive at the Lamb shift to 
within 6 MHz accuracy. 

Since then, higher-order corrections have been calculated, so the difference 
between experiment and theory has been reduced to 0.01 MHz. Theoretically, 
the $2S_{1/2}$ level is above the $2P_{1/2}$ energy level by 1057.864 ± 0.014 MHz. The 
experimental result is 1057.862 + 0.020 MHz. This is an excellent indicator of 
the basic correctness of QED. 


6.10 Dispersion Relations 


So far, we have been discussing the scattering matrix from the point of view of per- 
turbation and renormalization theory, that is, as a sum of increasingly complicated 
divergent Feynman diagrams. However, the calculations and the renormalization 
procedures rapidly become extraordinarily tedious and complicated, involving 
hundreds of Feynman diagrams at the fourth order. 

There is another approach one may take to the scattering matrix that completely 
avoids the complicated and often counterintuitive operations used in renormaliza- 
tion theory. Following Heisenberg and Chew,¹⁶ one may also take the approach 
that the S matrix by itself satisfies so many stringent physical requirements that 
perhaps the S matrix is uniquely determined. This approach avoids the formidable 
apparatus that one must introduce in order to renormalize even the simplest quan- 
tum field theory. Although this approach does not give a systematic method to 
construct the S matrix for various physical processes, it does give us rigid con- 
straints that are sometimes strong enough to solve for certain properties of the S 
matrix, such as sum rules and dispersion relations. 

The S matrix, for example, should satisfy the following properties: 


1. Unitarity or conservation of probability. 

2. Analyticity in the various energy variables. 
3. Lorentz invariance. 

4. Crossing symmetry. 


These, in turn, impose an enormous number of constraints on the S matrix, 
independent of perturbation theory, that may give clues to the final answer. One 
of the great successes of this approach was the use of dispersion relations¹⁷ 
to calculate constraints on the pion-nucleon cross sections. It gave dramatic 
verification of the power of analyticity in deriving nontrivial relations among 
the cross sections. These relations were used extensively in strong interaction 
physics where a successful, renormalizable field theory eluded physicists for 
many decades. 

In classical optics, we know that the imposition of causality on the propagation 
of a wave front is sufficient for us to write down a dispersion relation. Classically, 
we find that causality implies that the S matrix found in optics is analytic. Follow- 
ing this analogy to classical optics, we can show that the microscopic causality of 
the Green’s functions is sufficient to prove that the S matrix is analytic and hence 
satisfies certain dispersion relations. 

We know that functions f(z) analytic in the upper half plane obey very strin- 
gent constraints; they can be written as Cauchy contour integrals: 


$$f(z) = \frac{1}{2\pi i}\oint_C \frac{f(u)\,du}{u-z} \tag{6.143}$$

where the contour C is a large closed semicircle sitting on the real axis in the upper 
half complex plane. In the limit that f vanishes sufficiently rapidly at infinity, 
we can drop the integral around the outer semicircle. Notice that z sits in the 
upper half plane. If we slowly let z approach the real axis, then we must take the 
principal part of the integral: 

$$f(z) = \frac{1}{2\pi i}\left(P\int_{-\infty}^{\infty} + \int_{C'}\right)\frac{f(u)\,du}{u-z} = \frac{1}{\pi i}\,P\int_{-\infty}^{\infty}\frac{f(u)\,du}{u-z} \tag{6.144}$$

where C' is a small semicircular contour taken infinitesimally around the point z, 
which contributes $\frac{1}{2}f(z)$. Taking the real part of both sides, we then have: 

$$\mathrm{Re}\,f(z) = \frac{1}{\pi}\,P\int_{-\infty}^{\infty}\frac{\mathrm{Im}\,f(u)\,du}{u-z} \tag{6.145}$$



We can therefore convert this formula to an integral from 0 to $\infty$: 

$$\mathrm{Re}\,f(z) = \frac{2}{\pi}\,P\int_0^\infty \frac{u\,\mathrm{Im}\,f(u)\,du}{u^2-z^2} \tag{6.146}$$

where we assume $\mathrm{Im}\,f(-z) = -\mathrm{Im}\,f(z)$. 
In ordinary nonrelativistic quantum mechanics, we know that the scattering 
amplitude of a wave scattering off a stationary, hard sphere is given by a function 
f(p), which in turn is given as a function of the cross section via the optical 
theorem: 

$$\mathrm{Im}\,f(p) = \frac{p}{4\pi}\,\sigma_{\rm tot}(p) \tag{6.147}$$


Then we have the forward dispersion relation: 


$$\mathrm{Re}\,f(p) = \frac{1}{2\pi^2}\,P\int_0^\infty dq\,\frac{q^2\,\sigma_{\rm tot}(q)}{q^2-p^2} \tag{6.148}$$
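
A dispersion relation of this kind can be tested on any amplitude that is analytic in the upper half plane and falls off at infinity. The sketch below uses a toy amplitude $f(u) = 1/(1 - u^2 - iu)$, chosen purely for illustration (it is not taken from the text): its poles lie in the lower half plane, its imaginary part is odd in u, and it vanishes like $1/u^2$. The principal value in the half-line form of the relation, $\mathrm{Re}\,f(z) = (2/\pi)\,P\int_0^\infty du\, u\,\mathrm{Im}\,f(u)/(u^2 - z^2)$, is handled by subtracting the value of the numerator at $u = z$, which is allowed because $P\int_0^\infty du/(u^2 - z^2) = 0$.

```python
import math

def amplitude(u):
    """Toy amplitude f(u) = 1 / (1 - u^2 - i u); its poles lie in the lower half plane."""
    return 1.0 / complex(1.0 - u * u, -u)

def dispersion_integral(z, n=20000):
    """(2/pi) P int_0^inf du u Im f(u) / (u^2 - z^2), by the midpoint rule.

    The half line is mapped to (0, 1) with u = t / (1 - t), du = dt / (1 - t)^2.
    """
    def g(u):
        return u * amplitude(u).imag
    g_at_z = g(z)
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        u = t / (1.0 - t)
        total += (g(u) - g_at_z) / (u * u - z * z) / (1.0 - t) ** 2
    return (2.0 / math.pi) * total / n

for z in (0.7, 1.5):
    print(z, dispersion_integral(z), amplitude(z).real)
```

The integral over the imaginary part should reproduce the real part at each z. For physical amplitudes the imaginary part is supplied by the measured total cross section through the optical theorem, which is what turns the relation into a testable prediction.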
It often turns out that f does not vanish fast enough to make the above relation 
valid. For example, for photon-proton scattering (see Section 6.1), we have, for 
zero frequency: 

$$f(0) = -\frac{\alpha}{M} \tag{6.149}$$

which is the Thomson amplitude. Inserting this into the dispersion relation, we 
have: 

$$-\frac{\alpha}{M} = \frac{1}{2\pi^2}\int_0^\infty \sigma_{\rm tot}(q)\,dq \tag{6.150}$$

But this is obviously incorrect, because the right-hand side is positive, while 
the left-hand side is not. The error we have made is assuming that f vanishes at 
infinity. Instead, we will write a dispersion relation for f(p)/p, which does vanish 
at infinity. Repeating the same steps as before, we find the following dispersion 
relation: 

$$\frac{f(p)}{p} = \frac{f(0)}{p} + \frac{1}{\pi i}\,P\int_{-\infty}^{\infty}\frac{f(q)\,dq}{q\,(q-p)} \tag{6.151}$$


Taking the real part, we obtain our final result: 

$$\mathrm{Re}\,f(p) = f(0) + \frac{p^2}{2\pi^2}\,P\int_0^\infty \frac{\sigma_{\rm tot}(q)\,dq}{q^2-p^2} \tag{6.152}$$

Now that we have shown how to use the power of analyticity to write dispersion 
relations, let us combine this with the added condition of unitarity to obtain 
conditions on the S matrix. We know that the scattering matrix satisfies: 


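$$\sum_n S^\dagger_{fn}S_{ni} = \delta_{fi} \tag{6.153}$$

By subtracting off the delta function that describes when the particles do not 
interact with each other, the S matrix can be written in terms of a $\mathcal{T}$ matrix: 

$$S_{fi} = \delta_{fi} - i(2\pi)^4\delta^4(P_f - P_i)\,\mathcal{T}_{fi} \tag{6.154}$$

Inserting the expression for S in terms of $\mathcal{T}$, and taking the case of forward 
scattering, f = i, we have: 

$$\mathcal{T}_{ii} - \mathcal{T}^*_{ii} = -i(2\pi)^4\sum_n \delta^4(P_i - P_n)\,\mathcal{T}^*_{ni}\,\mathcal{T}_{ni} \tag{6.155}$$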


Earlier, in Eqs. (5.21) to (5.26) we derived: 


Otot = 


6 
a Y2ny'5*(P; — Pa )LFil? (6.156) 


where the sum over n includes integrating over final states. Inserting this into the 
unitarity condition, we find: 


1 
ToS ae Oror(@) 


3 Yb 8 (6.157) 


We now wish to apply this formalism to calculate dispersion relations for pion- 
nucleon scattering.¹⁸ We first need to define the kinematics of the collision. We 
begin with the scattering of a pion of momentum $q_1$ off a proton of momentum 
$p_1$, producing a pion of momentum $q_2$ and a proton of momentum $p_2$. 

Let us define the following Mandelstam variables: 

$$s = (q_1 + p_1)^2$$
$$t = (q_2 - q_1)^2$$
$$u = (q_1 - p_2)^2 \tag{6.158}$$
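
The sketch below makes these definitions concrete for elastic pion-proton scattering. It builds the four momenta in the center-of-mass frame (a frame choice and mass values in GeV that are assumptions made here for illustration), evaluates s, t, and u with the metric signature (+, -, -, -), and checks that their sum equals the sum of the squared external masses, $2m_\pi^2 + 2M^2$. The function names are illustrative.

```python
import math

M_PION = 0.1396     # charged pion mass in GeV (assumed value)
M_PROTON = 0.9383   # proton mass in GeV (assumed value)

def minkowski_sq(p):
    """p^2 = E^2 - |p|^2 for a four-vector (E, px, py, pz)."""
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def mandelstam(k, angle):
    """s, t, u for pi(q1) + p(p1) -> pi(q2) + p(p2) in the CM frame.

    k is the CM three-momentum in GeV and angle is the CM scattering angle.
    """
    e_pi = math.hypot(k, M_PION)
    e_p = math.hypot(k, M_PROTON)
    q1 = (e_pi, 0.0, 0.0, k)
    p1 = (e_p, 0.0, 0.0, -k)
    q2 = (e_pi, k * math.sin(angle), 0.0, k * math.cos(angle))
    p2 = (e_p, -k * math.sin(angle), 0.0, -k * math.cos(angle))
    s = minkowski_sq(add(q1, p1))
    t = minkowski_sq(sub(q2, q1))
    u = minkowski_sq(sub(q1, p2))
    return s, t, u

s, t, u = mandelstam(0.3, 0.8)
print(s + t + u, 2 * M_PION**2 + 2 * M_PROTON**2)
```

Forward scattering corresponds to `angle = 0`, where t vanishes and the amplitude depends on a single energy variable.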


For the case of forward scattering, the most convenient variable is the laboratory 
energy $\omega$ of the pion: 

$$\omega = \frac{q_1\cdot p_1}{M} = \frac{s - M^2 - m_\pi^2}{2M} \tag{6.159}$$

By Lorentz invariance, the matrix $\mathcal{T}$ appearing in the transition matrix is given 
by: 

$$\mathcal{T}(q_i, p_i) = \bar u(p_2)\left[A(s,t) + \frac{1}{2}(\not{q}_1 + \not{q}_2)\,B(s,t)\right]u(p_1) \tag{6.160}$$

where nothing is known about the form factors A and B. 
In the limit of forward scattering, these two form factors A and B merge into 
one unknown function: 

$$4\pi T(\omega) = A(s,0) + \omega B(s,0) \tag{6.161}$$

It can be shown that $T(\omega)$ is an analytic function of $\omega$. Then we can use cross- 
ing relations to relate the $\pi^+$-nucleon scattering amplitude to the $\pi^-$-nucleon 
scattering amplitude: 

$$T(\omega) = T^{\pi^- p}(\omega) = T^{\pi^+ p}(-\omega) \tag{6.162}$$

Written in this fashion, we can write the unitarity condition as: 

$$\mathrm{Im}\,T^{\pi^\pm p}(\omega) = \frac{k}{4\pi}\,\sigma^{\pi^\pm p}_{\rm tot}(\omega) \tag{6.163}$$

where $k = \sqrt{\omega^2 - m_\pi^2}$. 


Let us introduce the functions $T^\pm$: 

$$T^\pm = \frac{1}{2}\left[T(\omega) \pm T(-\omega)\right] = \frac{1}{2}\left[T^{\pi^- p}(\omega) \pm T^{\pi^+ p}(\omega)\right] \tag{6.164}$$


The point of this discussion is to use dispersion relations to write down non- 
trivial constraints for T that can then be compared with experiment. To write 
down the dispersion relations, we must deduce the analytic structure of T, which 
can be done by an analysis of Feynman diagrams. First, if we analyze four-point 
Feynman diagrams, there is a pole corresponding to the exchange of a single 
particle. This pole occurs at $\omega_B = m_\pi^2/2M$, which is due to the propagator. 
The intermediate particle corresponding to this pole is off-shell, and the value 
of $\omega$ is below the energy at which physical scattering takes place. Second, the 
intermediate states become on-shell in the region $[m_\pi, \infty]$ or $[-\infty, -m_\pi]$. For 
these values of the energy, real scattering takes place with real intermediate states. 
Analytically, this means that there is a Riemann cut in the real $\omega$ axis. 
Putting these two facts together, we can now write down the dispersion relation 
for T, consisting of the contour integral and also the pole term: 


—myz 7 co I fh y) 2 
U m 


es aw —-w-i€ '—-w—i€ @W—OB 


where the f term comes from the pole contribution, and: 


2 2 
f2 (£) a (6.166) 


Inserting the unitarity condition, we can rewrite the dispersion relation as: 


1 a V ow? a m2 + a 
ReT (w/o = ZaP i dia [om ?o’) = Gro KD) 
Z 
z ao : (6.167) 
(609 WR 


This is our final result, which agrees well with experiment. 

In summary, we have seen that a straightforward application of Feynman’s 
rules allows us to calculate, to lowest order, the interactions of electrons and pho- 
tons. In this approximation, we find good agreement with earlier, classical results 
and also experimental data. At higher orders, we find disastrous divergences 
that require careful renormalizations of the coupling constant and the masses. 
However, at the one-loop level, QED predicts results for the Lamb shift and the 
anomalous magnetic moment of the electron that are in excellent agreement with 
experimental data. This success, in fact, convinced the scientific community of 
the correctness of QED. 

We will now turn to the conclusion of Part I, which is a proof that QED is 
finite to any order in perturbation theory. 


6.11 Exercises 


1. Prove the Feynman parameter formula in Eqs. (6.75) and (6.105). (Hint: use 
induction.) 


2. Assume that electrons interact with massless neutrinos via the interaction 
Lagrangian $gA_\mu\bar\psi_e\gamma^\mu(1-\gamma_5)\psi_\nu$, where $A_\mu$ is a massive vector field and 
$\psi_\nu$ is the neutrino field. To second order in the coupling constant, draw all 
Feynman diagrams for the four-point function $e^- + \nu \to e^- + \nu$ and write 
down the corresponding expression for the scattering formula. Do not solve. 


3. Draw all the Feynman diagrams that appear in the electron-photon vertex 
function as well as the photon propagator to the third-loop order. Do not 
solve. 


4. Explicitly reduce out the traces in Eq. (6.45). 


5. Fill in the missing steps necessary to prove Eqs. (6.49) and (6.68). 


6. Consider the scattering of an electron off a stationary spin-0 particle, using 
the Yukawa interaction $\bar\psi\psi\phi$ to lowest order. Compute the matrix element 
for this scattering. How will the formula differ from the standard Coulomb 
formula? 


7. Compute the scattering matrix to lowest order of an electron with a stationary 
neutron if the interaction is given by $g\bar\psi\sigma^{\mu\nu}\psi F_{\mu\nu}$. Reduce out all gamma 
matrices. How does this result compare with the Coulomb formula? 


8. Show by symmetry arguments that the most general coupling between the 
proton and the electromagnetic current is given by: 

$$\bar u(p')\left[\gamma_\mu F_1(q^2) + \frac{i\kappa}{2m_p}\,\sigma_{\mu\nu}q^\nu F_2(q^2)\right]u(p) \tag{6.168}$$

where $\kappa$ is the anomalous magnetic moment of the proton in units of $e/2m_p$, 
and $F_1(q^2)$ and $F_2(q^2)$ are form factors and functions of the momentum 
transferred squared. 


9. From this, derive the Rosenbluth formula for electrons scattering off a sta- 
tionary proton target: 

$$\left(\frac{d\sigma}{d\Omega}\right)_{\rm lab} = \frac{\alpha^2\cos^2(\theta/2)}{4E^2\sin^4(\theta/2)\left[1 + 2E\sin^2(\theta/2)/m_p\right]}$$

$$\times\left\{|F_1(q^2)|^2 - \frac{q^2}{4m_p^2}\left[2\left|F_1(q^2) + \kappa F_2(q^2)\right|^2\tan^2(\theta/2) + \kappa^2|F_2(q^2)|^2\right]\right\} \tag{6.169}$$

where: 

$$q^2 = -\frac{4E^2\sin^2(\theta/2)}{1 + (2E/m_p)\sin^2(\theta/2)} \tag{6.170}$$

and where we have assumed that the laboratory energy of the electrons E is 
much larger than $m_e$. 


10. Prove that $\Pi_2 = 0$ using the identities in Eqs. (6.88) and (6.89). 


11. Prove Eqs. (6.62) and (6.64). Fill in the missing steps. 


12. Prove Eqs. (6.17) and (6.34). 


13. Show that the Mandelstam variables obey the following relationship: 

$$s + t + u = \sum_{i=1}^{4} m_i^2 \tag{6.171}$$

where $m_i$ are the masses of the external particles. Show that a four-point 
scattering amplitude has poles for various values of s, t, and u. Show how 
the scattering amplitude changes under crossing symmetry as a function of 
the Mandelstam variables. For example, prove Eq. (6.162). 


14. Consider a higher derivative theory with a Lagrangian given by 
$\phi[\partial^2 - (\partial^2)^2/m^2]\phi$. Calculate its propagator and show that it corresponds to a 
Pauli-Villars-type propagator; that is, it propagates a negative norm ghost. 
Analyze its ultraviolet divergences, if it has any, if we add the interaction 
term $\phi^4$. Show that the Lagrangian can be written in canonical form; that 
is, $\mathcal{L} = p\dot q - \mathcal{H}$. (Hint: add in auxiliary fields in order to absorb the large 
number of time derivatives in the action.) 


15. Take an arbitrary Feynman graph. Using the Feynman parameter formula, 
prove: 

(6.172) 

where: 

$$p_j = k_j + \sum_{s=1}^{k}\eta_{js}q_s \tag{6.173}$$

and $\eta_{js}$ equals ±1 or zero, depending on the placement of the jth and sth 
lines. If the jth line lies in the sth loop, then $\eta = +1$ (-1) if $p_j$ and $q_s$ are 
parallel (antiparallel). Otherwise, $\eta$ equals zero. 


16. Prove that the previous integral can be written as: 

$$\int_0^1 d\alpha_1\cdots d\alpha_n\,\frac{\delta\left(1 - \sum_i\alpha_i\right)}{\Delta^2\left[\sum_j\alpha_j(k_j^2 - m_j^2)\right]^{n-2k}} \tag{6.174}$$

where: 

$$\Delta = \det\left(\sum_j \alpha_j\eta_{js}\eta_{jt}\right) \tag{6.175}$$

(Hint: choose the $k_j$ in order to set the cross-terms to zero: 

$$\sum_{j=1}^{n} k_j\alpha_j\eta_{js} = 0 \tag{6.176}$$

Then diagonalize the integration over $q_s$.) 


17. In the previous problem, we can make an analogy between a Feynman integral 
and Kirchhoff’s laws found in electrical circuits. Show that we can make the 
analogy: 

$k$ ↔ current 
$\alpha$ ↔ resistance 
$k\alpha$ ↔ voltage 
$k^2\alpha$ ↔ power (6.177) 

Show that the statement: 

$$\sum_{\rm loop} k_j\alpha_j = 0 \tag{6.178}$$

for momenta taken around a closed loop corresponds to the statement that the 
voltage around a closed loop is zero. Second, we also have the equation: 

$$\sum_{\rm vertex} k_j = 0 \tag{6.179}$$

which is the statement of current conservation. Because the Feynman param- 
eters $\alpha$ are strictly real numbers, show that capacitors and inductors are not 
allowed in the circuit. In this way, we can intuitively analyze the analytic 
structure of a Feynman graph. 


Chapter 7 
Renormalization of QED 


The war against infinities was ended. There was no longer any reason 
to fear the higher approximations. The renormalization took care of all 
infinities and provided an unambiguous way to calculate with any desired 
accuracy any phenomenon resulting from the coupling of electrons with 
the electromagnetic field .... It is like Hercules’ fight against Hydra, the 
many-headed sea monster that grows a new head for every one cut off. 


But Hercules won the fight, and so did the physicists. 
—V. Weisskopf 


7.1 The Renormalization Program 


One of the serious complications found in quantum field theory is the fact that
the theory is naively divergent. When higher-order corrections are calculated for
QED or $\phi^4$ theory, one finds that the integrals diverge in the ultraviolet region, for
large momentum $p$.

Since the birth of quantum field theory, several generations of physicists 
have struggled to renormalize it. Some physicists, despairing of ever extracting 
meaningful information from quantum field theory, even speculated that the theory 
was fundamentally sick and must be discarded. In hindsight, we can see that the 
divergences found in quantum field theory were, in some sense, inevitable. In the 
transition from quantum mechanics to quantum field theory, we made the transition 
from a finite number of degrees of freedom to an infinite number. Because of 
this, we must continually sum over an infinite number of internal modes in loop 
integrations, leading to divergences. The divergent nature of quantum field theory 
then reflects the fact that the ultraviolet region is sensitive to the infinite number 
of degrees of freedom of the theory. Another way to see this is that the divergent 
graphs probe the extremely small distance region of space-time, or, equivalently, 
the high-momentum region. Because almost nothing is known about the nature of 
physics at extremely small distances or momenta, we are disguising our ignorance 
of this region by cutting off the integrals at small distances. 
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Since that time, there have been two important developments in renormaliza- 
tion theory. First was the renormalization of QED via the covariant formulation 
developed by Schwinger and Tomonaga and by Feynman (which were shown to
be equivalent by Dyson$^{1}$). This finally showed that, at least for the
electromagnetic interactions, quantum field theory was the correct formalism. Subsequently,
physicists attacked the problem of the strong and weak interactions via quantum 
field theory, only to face even more formidable problems with renormalization 
that stalled progress for several decades. 

The second revolution was the proof by ’t Hooft that spontaneously broken 
Yang-Mills theory was renormalizable, which led to the successful application of 
quantum field theory to the weak interactions and opened the door to the gauge 
revolution. 

There have been many renormalization proposals made in the literature, but all 
of them share the same basic physical features. Although details vary from scheme 
to scheme, the essential idea is that there is a set of “bare” physical parameters that 
are divergent, such as the coupling constants and masses. However, these bare 
parameters are unmeasurable. The divergences of these parameters are chosen so 
that they cancel against the ultraviolet infinities coming from infinite classes of 
Feynman diagrams, which probe the small-distance behavior of the theory. After 
these divergences have been absorbed by the bare parameters, we are left with the 
physical, renormalized, or “dressed” parameters that are indeed measurable. 

Since there are a finite number of such physical parameters, we are only 
allowed to make a finite number of such redefinitions. Renormalization theory, 
then, is a set of rules or prescriptions where, after a finite number of redefinitions, 
we can render the theory finite to any order. 

(If, however, an infinite number of redefinitions were required to render all 
orders finite, then any quantum field theory could be “renormalized.” We could, 
say, find a different rule or prescription to cancel the divergences for each of 
the infinite classes of divergent graphs. Unless there is a well-defined rule that 
determines how this subtraction is to be carried out to all orders, the theory is not 
well defined; it is infinitely ambiguous.) 

We should stress that, although the broad features of the renormalization 
program are easy to grasp, the details may be quite complicated. For example, 
solving the problem of “overlapping divergences” requires detailed graphical 
and combinatorial analysis and is perhaps the most important complication of 
renormalization theory. Due to these tedious details, there have been several 
errors made in the literature concerning renormalization theory. (For example, 
the original Dyson/Ward proof of the renormalization of QED actually breaks
down at the 14th order$^{2,3}$ because of overlapping diagrams. The original claims
of renormalization were thus incomplete. However, the proof can presumably be 
patched up.) 
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Mindful of these obscure complications, which tend to conceal the relatively 
simple essence of renormalization theory, in this chapter we will first try to 
approach the problem of renormalization from a schematic point of view, and 
then present the details later. Instead of presenting the complications first, which 
may be quite involved, we will discuss the basic components of renormalization 
theory, which occur in four essential steps: 


1. Power counting 

By simply counting the powers of $p$ in any Feynman graph, we can, for large
$p$, tell whether the integral diverges by calculating the degree of divergence of
that graph: each boson propagator contributes $p^{-2}$, each fermion propagator
contributes $p^{-1}$, each loop contributes a loop integration with $p^4$, and each
vertex with $n$ derivatives contributes at most $n$ powers of $p$. If the overall
power of $p$, that is, the degree of divergence $D$, is 0 or positive, then the
graph diverges. By simple power counting arguments, we can then calculate
rather quickly whether certain theories are hopelessly nonrenormalizable, or
whether they can be potentially renormalized.


2. Regularization 
Manipulating divergent integrals is not well defined, so we need to cut off
the integration over $d^4p$. This formally renders each graph finite, order by
order, and allows us to reshuffle the perturbation theory. At the end of the 
calculation, after we have rearranged the graphs to put all divergent terms into 
the physical parameters, we let the cutoff tend to infinity. We must also show 
that the resulting theory is independent of the regularization method. 


3. Counterterms or multiplicative renormalization 
Given a divergent theory that has been regularized, we can perform formal 
manipulations on the Feynman graphs to any order. Then there are at least 
two equivalent ways in which to renormalize the theory: 

First, there is the method of multiplicative renormalization, pioneered by 
Dyson and Ward for QED, where we formally sum over an infinite series of 
Feynman graphs with a fixed number of external lines. The divergent sum is 
then absorbed into a redefinition of the coupling constants and masses in the 
theory. Since the bare masses and bare coupling constants are unmeasurable, 
we can assume they are divergent and that they cancel against the divergences 
of corresponding Feynman graphs, and hence the theory has absorbed all 
divergences at that level. 

Second, there is the method of counterterms, pioneered by Bogoliubov,
Parasiuk, Hepp, and Zimmermann (BPHZ), where we add new terms directly
to the action to subtract off the divergent graphs. The coefficients of these 
counterterms are chosen so that they precisely kill the divergent graphs. In a 
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renormalizable theory, there are only a finite number of counterterms needed 
to render the theory finite to any order. Furthermore, these counterterms are 
proportional to terms in the original action. Adding the original action with the 
counterterms gives us a renormalization of the masses and coupling constants 
in the action. These two methods are therefore equivalent; that is, by adding 
counterterms to the action, they sum up, at the end of the calculation, to give 
a multiplicative rescaling of the physical parameters appearing in the action. 

These methods then give us simple criteria that are necessary (but not 
sufficient) to prove that a theory is renormalizable: 


a. The degree of divergence D of any graph must be a function only of the 
number of external legs; that is, it must remain constant if we add more 
internal loops. This allows us to collect all N-point loop graphs into 
one term. (For super-renormalizable theories, the degree of divergence 
actually decreases if we add more internal loops). 


b. The number of classes of divergent N-point graphs must be finite. These 
divergences must cancel against the divergences contained within the bare 
parameters. 


4. Induction 

The last step in the proof of renormalizability is to use an induction argu- 
ment. We assume the theory is renormalizable at the nth order in perturbation 
theory. Then we write down a recursion relation that allows us to generate 
the $(n+1)$st-order graphs in terms of the $n$th-order graphs. By proving the
$(n+1)$st-order graphs are all finite, we can prove, using either multiplicative
or counterterm renormalization, that the entire perturbation theory, order by 
order, is finite. Since there are various recursion relations satisfied by field 
theory (e.g., Schwinger-Dyson equations, renormalization group equations, 
etc.), there are also a variety of renormalization programs. However, all 
induction proofs ultimately rely on Weinberg’s theorem (which states that a 
Feynman graph converges if the degree of divergence of the graph and all its 
subgraphs is negative). 
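The power-counting rule of step 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `superficial_degree` and the choice of one-loop QED graphs as examples are introduced here and do not come from the text; the propagator counts used for each graph are the standard one-loop ones.

```python
def superficial_degree(loops, boson_props, fermion_props,
                       vertex_derivatives=0, dim=4):
    """Naive degree of divergence D of a graph by power counting.

    Each loop contributes p^dim, each boson propagator p^-2, each fermion
    propagator p^-1, and vertex derivatives add at most one power of p each.
    """
    return dim * loops - 2 * boson_props - fermion_props + vertex_derivatives


def superficially_divergent(degree):
    """A graph is superficially divergent when D is zero or positive."""
    return degree >= 0


# Illustrative one-loop QED graphs in four dimensions (hypothetical labels).
examples = {
    "photon self-energy": superficial_degree(1, boson_props=0, fermion_props=2),
    "electron self-energy": superficial_degree(1, boson_props=1, fermion_props=1),
    "vertex correction": superficial_degree(1, boson_props=1, fermion_props=2),
    "six-photon loop": superficial_degree(1, boson_props=0, fermion_props=6),
}

for name, degree in examples.items():
    print(name, degree, superficially_divergent(degree))
```

The count is only superficial: symmetries such as gauge invariance can lower the actual degree of divergence, and convergence of the full graph also requires every subgraph to have negative degree, as Weinberg's theorem states.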


7.2 Renormalization Types 
Based on simple power counting of graphs, we can begin to catalog a wide variety
of different field theories on the basis of their renormalizability. We will group
quantum field theories into four distinct categories:

1. Nonrenormalizable theories.
2. Renormalizable theories.
3. Super-renormalizable theories.
4. Finite theories.


7.2.1 Nonrenormalizable Theories 


To determine the degree of divergence of any graph, we need to know the dimen- 
sion of the various fields and coupling constants. We can determine the dimension 
of a field by analyzing the behavior of the propagator at large momenta, or by 
analyzing the free action. We demand that the action has the dimension of $\hbar$
(i.e., has zero dimension). Since the volume element $d^4x$ has dimension cm$^4$, this
means that the Lagrangian must have dimension cm$^{-4}$.

For example, the Klein-Gordon action contains the term $(\partial_\mu \phi)^2$. Since the
derivative has dimension cm$^{-1}$, it means that $\phi$ also has dimension cm$^{-1}$. The
mass $m$ has dimension cm$^{-1}$ in our units, so that $m^2\phi^2$ has the required dimension
cm$^{-4}$. By the same reasoning, the massless Maxwell field also has dimension
cm$^{-1}$. The Dirac field, however, has dimension cm$^{-3/2}$, so that the term
$\bar\psi\, i\gamma^\mu \partial_\mu \psi$ has dimension cm$^{-4}$.

It is customary to define the dimension of a field in terms of inverse centimeters
(or, equivalently, grams). If $[\phi]$ represents the dimension of the field in inverse
centimeters, then the dimensions of the fields in $d$ space-time dimensions can be
easily computed by analyzing the free action, which must be dimensionless:

$$ [\phi] = \frac{d-2}{2}, \qquad [\psi] = \frac{d-1}{2} \qquad (7.1) $$


The simplest example of a nonrenormalizable theory is one that has a coupling
constant with negative dimension, like $\phi^5$ theory in four dimensions. To keep
the action dimensionless, the coupling constant $g$ must have dimension $-1$. Now
let us analyze the behavior of an $N$-point function. If we insert a $g\phi^5$ vertex
into the $N$-point function, this increases the number of $g$'s by one, decreasing
the dimension of the graph. This must be compensated by an increase in the
overall power of $k$ by 1, which increases the dimension of the graph, such that
the total dimension of the graph remains the same. Inserting a vertex into the
$N$-point graph has thus made it more divergent by one factor of $k$. By inserting an
arbitrary number of vertices into the $N$-point function, we can arbitrarily increase
the overall power of $k$ and hence make the graph arbitrarily divergent. The same
remarks apply for $\phi^n$ for $n > 4$ in four dimensions.
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(Since the presence or absence of dimensional coupling constants depends so 
crucially on the dimension of space-time, we will find that in different spacetimes 
the set of renormalizable and nonrenormalizable theories are quite different.) 
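The dimensional bookkeeping above is easy to automate. The following is a minimal sketch, assuming only the free-field dimensions $[\phi] = (d-2)/2$ and $[\psi] = (d-1)/2$ and the requirement that each term in the action be dimensionless; the function names are hypothetical, and the classification reflects power counting alone, not a proof of renormalizability.

```python
from fractions import Fraction


def scalar_dim(d):
    """Mass dimension of a scalar field in d space-time dimensions."""
    return Fraction(d - 2, 2)


def spinor_dim(d):
    """Mass dimension of a spinor field in d space-time dimensions."""
    return Fraction(d - 1, 2)


def coupling_dim(d, n_scalars=0, n_spinors=0, n_derivatives=0):
    """Dimension the coupling must carry so that the interaction term,
    integrated over d^d x, is dimensionless."""
    operator_dim = (n_scalars * scalar_dim(d)
                    + n_spinors * spinor_dim(d)
                    + n_derivatives)
    return d - operator_dim


def classify(g_dim):
    """Power-counting classification by the sign of the coupling dimension."""
    if g_dim < 0:
        return "nonrenormalizable"
    if g_dim == 0:
        return "renormalizable"
    return "super-renormalizable"


print(classify(coupling_dim(4, n_scalars=5)))   # phi^5 in d = 4
print(classify(coupling_dim(4, n_scalars=4)))   # phi^4 in d = 4
print(classify(coupling_dim(4, n_spinors=4)))   # four-fermion term in d = 4
print(classify(coupling_dim(2, n_spinors=4)))   # four-fermion term in d = 2
```

Exact fractions are used so that half-integer field dimensions compare cleanly against zero.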

Some examples of nonrenormalizable theories include:


1. Nonpolynomial actions
These actions have an infinite number of terms in them, and typically look like
$\sum_n \phi^n$. They necessarily have coupling constants with negative dimension,
and hence are not renormalizable.


2. Gravity 
Quantum gravity has a coupling constant $\kappa$ with negative dimension. ($\kappa^2 \sim G_N$,
where $G_N$ is Newton’s constant, which has dimension $-2$.) This means
that we cannot perform the standard renormalization program. Also, a power 
expansion in the coupling constant yields a nonpolynomial theory. Thus, 
quantum gravity is not renormalizable and is infinitely ambiguous. 


3. Supergravity 
By the same arguments, supergravity is also nonrenormalizable. Even though 
it possesses highly nontrivial Ward identities that kill large classes of diver- 
gences, the gauge group is not large enough to kill all the divergences. 


4. Four-fermion interactions 
These actions, like the original Fermi action or the Nambu-Jona-Lasinio
action, contain terms like $(\bar\psi\psi)^2$. By power counting, we know that $\psi$ has
dimension cm$^{-3/2}$, so the four-fermion action has dimension 6. This requires
a coupling constant with dimension $-2$, so the theory is nonrenormalizable.


5. $\bar\psi \sigma_{\mu\nu} \psi F^{\mu\nu}$
This coupling, which seems to be perfectly well defined and gauge invariant,
is not renormalizable because it has dimension 5, so its coupling constant has
dimension $-1$.


6. Massive vector theory with non-Abelian group 


A propagator like:

$$ \frac{g_{\mu\nu} - k_\mu k_\nu / M^2}{k^2 - M^2 + i\epsilon} \qquad (7.2) $$

goes like $O(1)$ for large $k$, and hence does not damp fast enough to give us
a renormalizable theory. So the theory fails to be renormalizable by power
counting arguments.
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7. Theories with anomalies 4 
Ward—Takahashi identities are required to prove the renormalizability of gauge 
theories. However, sometimes a classical symmetry of an action does not 
survive the rigorous process of quantization, and the symmetry is broken. 
We say that there are anomalies in the theory that destroy renormalizability. 
Anomalies will be studied at greater length in Chapter 12.


7.2.2  Renormalizable Theories 


The renormalizable theories only form a tiny subset of possible quantum field
theories. They have only a finite number of counterterms. They also have no
dimensional coupling constants; so the dimension of each term in the Lagrangian
is cm$^{-4}$.

Some well-known renormalizable theories include: 


1. $\phi^4$
This is the simplest renormalizable theory one can write in four dimensions.
Because $\phi$ has dimension 1, this interaction has dimension 4, and hence the
coupling constant is dimensionless. This theory is a prototype of much more
complicated actions. (However, it should be pointed out that this theory, when
summed to all orders, is probably a free theory.)


2. Yukawa theory 
The Yukawa theory has a coupling between fermions and scalars given by: 


$$ \mathcal{L} = g\, \bar\psi \tau^a \psi\, \phi^a \qquad (7.3) $$


where $\tau^a$ is the generator of some Lie group and $g$ is dimensionless.


3. Massive vector Abelian theory 
Although this theory has a propagator similar to the massive vector
non-Abelian gauge theory, the troublesome $k_\mu k_\nu$ term drops out of any Feynman
graph by U(1) gauge invariance. (This term cannot be dropped in a gauge
theory with a non-Abelian group.)


4. QED 
There are no dimensional coupling constants, and, by power counting argu- 
ments, we need only a finite number of counterterms. 


5. Massless non-Abelian gauge theory 
By power counting, this theory is renormalizable. A more detailed proof of
the renormalizability of this theory will be shown in Chapter 13.
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6. Spontaneously broken non-Abelian gauge theory 

Although massive non-Abelian gauge theories are, in general, nonrenormal- 
izable, there is one important exception. If the gauge symmetry of a massless 
non-Abelian theory is spontaneously broken, then renormalizability is not 
destroyed. The fact that renormalizability persists even after the gauge group 
is spontaneously broken helped to spark the gauge revolution. 

Since the renormalizability of a theory is dependent on the dimension of 
space-time, we also have the following renormalizable theories: 


7. oYg? 


This is renormalizable in three dimensions. 


8. $\phi^3$
This is renormalizable in six dimensions.


9. $\phi^5$, $\phi^6$
These are renormalizable in three dimensions.


10. $(\bar\psi\psi)^2$
This is renormalizable in two dimensions. Although this interaction is
nonrenormalizable in four dimensions, in two dimensions it requires no
dimensional coupling constant.
Finally, we can write down the complete set of interactions for spin 0, 1/2, and
1 fields that are potentially renormalizable. In four dimensions, they are given
symbolically by (if we omit isospin and Lorentz indices):


o, wwe, (AY, VAY, o'a,0A"%, G1GA? (7.4) 


(The Yang-Mills theory has an additional 0A? interaction, which we will 
discuss separately.) 


7.2.3 Super-renormalizable Theories 


Super-renormalizable theories converge so rapidly that there are only a finite 
number of graphs that diverge in the entire perturbation theory. The degree of 
divergence actually goes down as we add more internal loops. The simplest 
super-renormalizable theories have coupling constants with positive dimension, 
such as $\phi^3$ in four dimensions. Repeating the argument used earlier, this means
that increasing the order g of an N-point function must necessarily decrease the 
number of momenta k appearing in the integrand, such that the overall dimension 
of the graph remains the same. Thus, as the order of the graph increases, sooner 
or later the graph becomes convergent. Thus, there are only a finite number of 
divergent graphs in the theory. 
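To make the claim concrete for $\phi^3$ theory in four dimensions, the sketch below enumerates which combinations of external legs $E$ and vertices $V$ can be superficially divergent. It assumes only the counting rules $3V = 2I + E$ (three legs per vertex), $L = I - V + 1$, and $D = 4L - 2I$, which combine to $D = 4 - E - V$; the function name `phi3_counts` is hypothetical. The counting conditions are necessary for a connected graph to exist, so the list should be read as candidate classes.

```python
def phi3_counts(E, V):
    """Return (I, L, D) for a phi^3 loop graph in four dimensions with E
    external legs and V vertices, or None if the counting rules exclude it."""
    twice_I = 3 * V - E          # each vertex has three legs: 3V = 2I + E
    if twice_I < 0 or twice_I % 2:
        return None
    I = twice_I // 2             # number of internal lines
    L = I - V + 1                # number of loops
    if L < 1:
        return None              # tree graphs have no loop integration
    D = 4 * L - 2 * I            # d^4p per loop, 1/p^2 per propagator
    return I, L, D


divergent = []
for E in range(1, 8):
    for V in range(1, 12):
        counts = phi3_counts(E, V)
        if counts is not None and counts[2] >= 0:
            divergent.append((E, V))

print(divergent)  # [(1, 1), (1, 3), (2, 2)]
```

Because $D$ drops by one for every added vertex, the scan range can be enlarged without producing new entries, which is the sense in which only a finite number of graphs diverge.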
Some examples of super-renormalizable theories include:


1. $\phi^3$
This is super-renormalizable in four dimensions (but the theory is sick because
it is not bounded from below and the vacuum is not stable).


2. $\phi^4$
This is super-renormalizable in three dimensions because it has three superficially
divergent graphs, which contribute to the two-point function.


3. $P(\phi)$
$\phi$ has zero dimension in two dimensions; so we can have an arbitrary polynomial
in the action yet still maintain renormalizability. The interaction
$P(\phi)$ produces only a finite number of divergences, all of them due to
self-contractions of the lines within the various vertices.


4. $P(\phi)\bar\psi\psi$
This is also super-renormalizable in two dimensions.


7.2.4 Finite Theories 


Although Dirac was one of the creators of quantum field theory, he was dissatisfied 
with the renormalization approach, considering it artificial and contrived. Dirac, 
in his later years, sought a theory in which renormalization was not necessary at 
all. Dirac’s verdict about renormalization theory was, “This is just not sensible 
mathematics. Sensible mathematics involves neglecting a quantity when it turns 
out to be small—not neglecting it because it is infinitely great and you do not want 
it!”

Instead, Dirac believed that a new theory was needed in which renormaliza- 
tions were inherently unnecessary. Until recently, it was thought that Dirac’s 
program was a dead end, that renormalization was inherent within any quantum 
field theory. However, because of the introduction of supersymmetry, we have 
two possible types of theories that are finite to any order in perturbation theory: 


1. Super Yang-Mills theory 

Supersymmetry gives us new constraints among the renormalization constants 
Z that are not found in ordinary quantum theories. In fact, for the SO(4) super 
Yang-Mills theory, one can show that these constraints are enough to guar- 
antee that Z = 1 for all renormalization constants to all orders in perturbation 
theory. In addition, the SO(2) super Yang-Mills theory, coupled to certain 
classes of supersymmetric matter, is also finite to all orders in perturbation 
theory. Although these super Yang—Mills theories are uninteresting from the 
point of view of phenomenology, the fact that supersymmetry is powerful 
enough to render certain classes of quantum field theories finite to all orders 
is reason enough to study them. 
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2. Superstrings 

Supersymmetry also allows us to construct actions much more powerful than 
the super Yang-Mills theory, such as the superstring theory. Superstring theory 
has two important properties. First, it is finite to all orders in perturbation 
theory and is free of all anomalies. Second, it contains quantum gravity, 
as well as all known forces found in nature, as a subset. The fact that 
superstring theory is the only candidate for a finite theory of quantum gravity 
is remarkable. (Because there is no experimental evidence at all to support the 
existence of supersymmetry, we will discuss these supersymmetric theories 
later in Chapters 20 and 21.) 


7.3 Overview of Renormalization in $\phi^4$ Theory


Since renormalization theory is rather intricate, we will begin our discussion by
giving a broad overview of the renormalization program for the simplest quantum
field theory in four dimensions, the $\phi^4$ theory, and then for QED. We will stress
only the highlights of how renormalization is carried out. After sketching the
overall renormalization program for these two theories, we will then present the
details, such as the regularization program and the induction argument.

Our goal is to present the arguments that show that (1) the degree of divergence
$D$ of any Feynman graph in $\phi^4$ theory is dependent only on the number of
external lines, and that (2) these divergent classes can be absorbed into the physical
parameters.

Given any graph for $\phi^4$, we can analyze its divergent structure by power
counting as follows:


E = number of external legs
I = number of internal lines
V = number of vertices
L = number of loops (7.5)


The degree of divergence of an arbitrary Feynman graph is easily computed.
Each internal propagator contributes $1/p^2$, while each loop contributes $d^4p$, so
the degree of divergence is given by:


$$ D = 4L - 2I \qquad (7.6) $$


Now we use some simple counting arguments about graphs to reduce this 
expression. Each vertex has four lines connecting to it. Each of these lines, in 
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turn, either ends on an external leg, or on one end of an internal leg, which has 
two ends. Thus, we must have: 


$$ 4V = 2I + E \qquad (7.7) $$


Also, the loop number L can be calculated by analyzing the independent momenta 
in any Feynman graph. The number of independent momenta is equal to the 
number of internal lines / minus the constraints coming from momentum conser- 
vation. There are V such momentum constraints, minus the overall momentum 
conservation from the entire graph. Since the number of independent momenta in 
a Feynman graph is also equal to the number of loop momenta, we have, therefore: 


$$ L = I - V + 1 \qquad (7.8) $$


Inserting these graphical rules into our expression for the divergence of a Feynman 
graph, we now have: 


$$ D = 4 - E \qquad (7.9) $$


This means that the degree of divergence of any graph in four dimensions is 
strictly dependent on the number of external lines, which is a necessary condition 
for renormalizability. The degree of divergence is hence independent of the 
number of internal loops in the graph. 
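For readers who want the intermediate algebra, the substitution leading to Eq. (7.9) can be written out as a short worked derivation. Nothing is assumed beyond the three counting relations $D = 4L - 2I$, $4V = 2I + E$, and $L = I - V + 1$ of Eqs. (7.6) through (7.8).

```latex
\begin{aligned}
D &= 4L - 2I \\
  &= 4(I - V + 1) - 2I  && \text{using } L = I - V + 1 \\
  &= 2I - 4V + 4 \\
  &= (4V - E) - 4V + 4  && \text{using } 2I = 4V - E \\
  &= 4 - E
\end{aligned}
```

Both $I$ and $V$ cancel, which is why the result depends on the external legs alone.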

This is also a gratifying result because it means that only the two-point and the
four-point graphs are divergent. This, in turn, gives us the renormalization of the
two physical quantities in the theory: the mass and the coupling constant. Thus, by
using only power counting arguments, in principle we can renormalize the entire
theory with only two redefinitions corresponding to two physical parameters. [In
$d$ dimensions, however, the degree of divergence is $D = d + (1 - d/2)E + (d - 4)V$.
Because $D$ increases with the number of internal vertices, there are problems with
renormalizing the theory in higher dimensions.]
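The $d$-dimensional formula can be checked against the raw counting rules. The sketch below is an illustration with hypothetical function names: it computes $D = dL - 2I$ directly from $4V = 2I + E$ and $L = I - V + 1$, and compares the result with the closed form $D = d + (1 - d/2)E + (d - 4)V$ quoted in the text.

```python
from fractions import Fraction


def phi4_degree(d, E, V):
    """Degree of divergence of a phi^4 graph in d dimensions, computed
    from the counting rules rather than from the closed form."""
    I = Fraction(4 * V - E, 2)   # 4V = 2I + E
    L = I - V + 1                # loop number
    return d * L - 2 * I         # d^dp per loop, 1/p^2 per propagator


def closed_form(d, E, V):
    """Closed-form expression for the same quantity."""
    return d + (1 - Fraction(d, 2)) * E + (d - 4) * V


for d in (3, 4, 6):
    row = [phi4_degree(d, 4, V) for V in range(1, 5)]
    print(d, [str(x) for x in row])
```

For the four-point function the printed rows fall with $V$ in three dimensions, stay constant in four, and grow in six, matching the super-renormalizable, renormalizable, and nonrenormalizable cases.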

Next, we want to sketch the two methods by which renormalization is carried
out: multiplicative renormalization and counterterms. For $\phi^4$ theory, we first
begin with an elementary discussion of multiplicative renormalization.

We start with an action defined with “unrenormalized” or “bare” coupling
constants and masses:

$$ \mathcal{L} = \frac{1}{2}\partial_\mu\phi_0\, \partial^\mu\phi_0 - \frac{1}{2} m_0^2 \phi_0^2 - \frac{\lambda_0}{4!}\phi_0^4 \qquad (7.10) $$

(We remind the reader that $m_0$ and $\lambda_0$ are formally infinite, but are not measurable.)

Figure 7.1. The complete propagator $\Delta'$ is the sum of an infinite chain of one-particle
irreducible graphs $\Sigma$. $\Delta'$ itself is improper.

We define $\Sigma(p^2)$ to be the sum of all proper (or one-particle irreducible) two-point
graphs. (A graph is called improper or one-particle reducible if we can, by cutting
just one line, break the graph into two distinct parts. A proper diagram cannot be
split into two parts by cutting one line.) For example, the first-loop contribution
to $\Sigma(p^2)$ is given by the following:


$$ -i\Sigma = -\frac{i\lambda_0}{2} \int \frac{d^4p}{(2\pi)^4}\, \frac{i}{p^2 - m_0^2 + i\epsilon} \qquad (7.11) $$


which is obtained by taking a four-point vertex and joining two legs together into 
a loop. It is a proper graph because if we cut the loop, it becomes a four-point 
vertex and hence does not split apart into two distinct pieces. 

Now let $\Delta'(p)$ represent the sum over all possible two-point graphs. We call it
the complete or full propagator. $\Delta'(p)$ is obviously the sum of two parts, a proper
and improper part. By definition, the proper part cannot be split any further by
cutting an intermediate line, but the improper part can. If we split the improper
part in half, then each piece, in turn, is the sum of a proper and improper part.
Then the smaller improper part can be further split into a proper and improper part.
By successively cutting all improper pieces into smaller pieces, we can eliminate
all improper parts and write the complete propagator entirely in terms of proper
parts. This successive cutting process obviously creates a sequence of proper parts
strung together along a string, as in Figure 7.1. $\Delta'(p)$ itself is improper, since it
can be split into two pieces by cutting one line.

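We can iterate the complete self-energy graph $\Delta'(p)$ an infinite number of
times with respect to $\Sigma$, so that we can formally sum the series, using the fact that:

$$ \frac{1}{1 + x} = \sum_{n=0}^{\infty} (-x)^n \qquad (7.12) $$

The sum equals:

$$ \begin{aligned}
i\Delta'(p) &= i\Delta_F(p) + i\Delta_F(p)\,[-i\Sigma(p^2)]\, i\Delta_F(p) + \cdots \\
&= i\Delta_F(p)\, \frac{1}{1 + i\Sigma(p^2)\, i\Delta_F(p)} \\
&= \frac{i}{p^2 - m_0^2 - \Sigma(p^2) + i\epsilon}
\end{aligned} \qquad (7.13) $$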
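The resummation can be checked numerically in a toy setting. The sketch below is only an illustration: it treats the self-energy as a small finite number `sigma` (the real $\Sigma$ is divergent and momentum dependent), drops the $i\epsilon$, and uses hypothetical function names. It sums the chain $i\Delta_F + i\Delta_F(-i\Sigma)i\Delta_F + \cdots$ term by term and compares it with the closed form $i/(p^2 - m_0^2 - \Sigma)$.

```python
def dyson_partial_sum(p2, m0_sq, sigma, n_terms):
    """Sum the first n_terms of the chain of self-energy insertions."""
    free = 1j / (p2 - m0_sq)            # i * Delta_F(p), with i*epsilon dropped
    ratio = (-1j * sigma) * free        # one more insertion: (-i Sigma)(i Delta_F)
    total = 0j
    term = free
    for _ in range(n_terms):
        total += term
        term *= ratio
    return total


def dyson_resummed(p2, m0_sq, sigma):
    """Closed form obtained from the geometric series."""
    return 1j / (p2 - m0_sq - sigma)


# Toy numbers chosen so that |Sigma / (p^2 - m0^2)| < 1 and the series converges.
p2, m0_sq, sigma = 2.0, 1.0, 0.3
for n in (1, 5, 20, 60):
    diff = abs(dyson_partial_sum(p2, m0_sq, sigma, n) - dyson_resummed(p2, m0_sq, sigma))
    print(n, diff)
```

In the field theory itself the sum is formal, since the expansion parameter is not small; the toy check only confirms the algebra of the geometric series.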


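Now power expand this self-energy correction $\Sigma(p^2)$ around $p^2 = m^2$, where $m$
is, at this point, finite but arbitrary:

$$ \Sigma(p^2) = \Sigma(m^2) + (p^2 - m^2)\Sigma'(m^2) + \tilde\Sigma(p^2) \qquad (7.14) $$

where $\tilde\Sigma(p^2) \sim O\big((p^2 - m^2)^2\big)$, and where $\Sigma(m^2)$ and $\Sigma'(m^2)$ are divergent. The
net effect of summing this infinite series of graphs is that the complete propagator
is now modified to:

$$ \begin{aligned}
i\Delta'(p) &= \frac{i}{p^2 - m_0^2 - \Sigma(m^2) - (p^2 - m^2)\Sigma'(m^2) - \tilde\Sigma(p^2) + i\epsilon} \\
&= \frac{i}{p^2 - [m_0^2 + \Sigma(m^2)] - (p^2 - m^2)\Sigma'(m^2) - \tilde\Sigma(p^2) + i\epsilon} \\
&= \frac{i}{[1 - \Sigma'(m^2)](p^2 - m^2) - \tilde\Sigma(p^2) + i\epsilon}
\end{aligned} \qquad (7.15) $$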


At this point, $m_0$ is infinite but arbitrary. Since $\Sigma(m^2)$ is also divergent, we will
define $m_0$ and $m$ such that $m_0$ cancels against the divergent part coming from
$\Sigma(m^2)$, giving us the finite piece $m$. We will choose:

$$ m_0^2 + \Sigma(m^2) = m^2 \qquad (7.16) $$

[There is a certain arbitrariness in how two infinite terms cancel, because an
infinite term plus a finite term is still infinite. $\Sigma(m^2)$ may have a finite piece in
addition to an infinite piece, so the value of $m^2$ is, at this point, arbitrary. We will
comment on this important ambiguity later.]

With this choice, we have:

$$ i\Delta'(p) = \frac{iZ_\phi}{p^2 - m^2 - \tilde\Sigma_1(p^2) + i\epsilon} \qquad (7.17) $$

where we have made the following redefinitions:

$$ \begin{aligned}
m^2 &= m_0^2 + \Sigma(m^2) = m_0^2 + \delta m^2 \\
Z_\phi &= \frac{1}{1 - \Sigma'(m^2)} \\
\tilde\Sigma_1(p^2) &= \tilde\Sigma(p^2)\,[1 - \Sigma'(m^2)]^{-1} = Z_\phi \tilde\Sigma(p^2)
\end{aligned} \qquad (7.18) $$

Since the divergence of the propagator is contained within $Z_\phi$, we can extract
this divergent piece. From the unrenormalized propagator $\Delta'$, we will remove
this divergence and hence obtain the renormalized propagator $\tilde\Delta'(p)$:

$$ \Delta'(p) = Z_\phi \tilde\Delta'(p) \qquad (7.19) $$


This simple example demonstrates some of the main features of renormalization
theory. First, the pole structure of the bare propagator has changed by
summing up all graphs. Although we started with a bare propagator $\Delta_F(p)$ with
a simple pole at the (infinite) bare mass $m_0$, the effect of summing all possible
graphs is to shift the bare mass to the renormalized or “dressed” mass $m$. The $p$
dependence of the complete propagator $\Delta'$ has changed in a nontrivial fashion. It
no longer consists of just a simple pole. However, the complete propagator $\Delta'$
still has a pole at the shifted mass squared $p^2 = m^2$ because $\tilde\Sigma_1(m^2) = 0$.

Second, the divergent unrenormalized propagator $\Delta'(p)$ has been converted
to the convergent, renormalized propagator $\tilde\Delta'(p)$ by a multiplicative rescaling by
$Z_\phi$. This is crucial in our discussion of renormalization, because it means that we
can extract the divergence of self-energy graphs by a simple rescaling (which will
eventually be absorbed by redefining the physical coupling constants and wave
functions of the theory).
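The two statements above, the shifted pole and the overall factor $Z_\phi$, can be illustrated with a toy model. The sketch below assumes a made-up, finite polynomial self-energy $\Sigma(p^2) = a + b\,(p^2 - m^2) + c\,(p^2 - m^2)^2$, so that $\Sigma(m^2) = a$, $\Sigma'(m^2) = b$, and $\tilde\Sigma(p^2) = c\,(p^2 - m^2)^2$; the coefficients and function names are hypothetical, and the $i\epsilon$ is dropped. It checks that the resummed propagator equals the rescaled form with $m^2 = m_0^2 + \Sigma(m^2)$ and $Z_\phi = 1/(1 - \Sigma'(m^2))$.

```python
def full_propagator(p2, m0_sq, a, b, c):
    """i / (p^2 - m0^2 - Sigma(p^2)) for the toy polynomial self-energy."""
    m_sq = m0_sq + a                     # dressed mass: m^2 = m0^2 + Sigma(m^2)
    x = p2 - m_sq
    sigma = a + b * x + c * x ** 2       # expansion of Sigma around p^2 = m^2
    return 1j / (p2 - m0_sq - sigma)


def rescaled_propagator(p2, m0_sq, a, b, c):
    """i Z / (p^2 - m^2 - Z * Sigma_tilde(p^2)) with Z = 1 / (1 - Sigma'(m^2))."""
    m_sq = m0_sq + a
    x = p2 - m_sq
    z_phi = 1.0 / (1.0 - b)
    sigma_tilde_1 = z_phi * c * x ** 2   # vanishes at p^2 = m^2
    return 1j * z_phi / (p2 - m_sq - sigma_tilde_1)


m0_sq, a, b, c = 1.0, 0.5, 0.2, 0.1
for p2 in (0.0, 3.0, 7.5):
    print(p2, full_propagator(p2, m0_sq, a, b, c), rescaled_propagator(p2, m0_sq, a, b, c))

# Near the pole the propagator behaves like i Z / (p^2 - m^2).
near_pole = (m0_sq + a) + 1e-6
residue = (near_pole - (m0_sq + a)) * full_propagator(near_pole, m0_sq, a, b, c)
print(residue)
```

In the toy model every quantity is finite, so the rescaling is an exact identity; in the field theory the same algebra is applied to regularized, cutoff-dependent quantities.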

We could also have written this in the language of proper vertices. Let
$\Gamma^{(n)}$ represent the one-particle irreducible $n$-point vertex function, with the $n$
propagators on the external lines removed. (The proper vertex will be defined
more rigorously in Chapter 8.) In the free theory, the proper vertex for the
two-point function is defined as the inverse of the propagator:

$$ i\Gamma^{(2)}(p) = p^2 - m_0^2 \qquad (7.20) $$

Once we sum to all orders in perturbation theory, then the proper vertex function
becomes infinite. If we divide out by this infinite factor $Z_\phi$, then we can write the
renormalized vertex as:

$$ i\tilde\Gamma^{(2)}(0) = -m^2 \qquad (7.21) $$

where we have taken $p = 0$.

Now consider the effect of renormalization on the coupling constant $\lambda_0$. Let
$\Gamma^{(4)}$ represent the four-point proper vertex, summed over all possible graphs, with
propagators on the external lines removed. To lowest order, this four-point graph
equals:

$$ i\Gamma^{(4)}(p_i) = \lambda_0 \qquad (7.22) $$
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The one-loop correction to this is given by: 


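$$ i\Gamma^{(4)}(p_i) = \lambda_0 + \frac{i\lambda_0^2}{2} \int \frac{d^4p}{(2\pi)^4}\, \frac{1}{[(p - q)^2 - m_0^2 + i\epsilon][p^2 - m_0^2 + i\epsilon]} + \cdots \qquad (7.23) $$
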
Although we cannot evaluate $\Gamma^{(4)}$ to all orders in perturbation theory, we know that
it is Lorentz invariant and hence can be written in terms of the three Mandelstam
variables:

$$ s = (p_1 + p_2)^2; \quad t = (p_1 + p_3)^2; \quad u = (p_1 + p_4)^2 \qquad (7.24) $$

Therefore:

$$ i\Gamma^{(4)}(p_i) = \lambda_0 + f(s) + f(t) + f(u) \qquad (7.25) $$

for some divergent function $f$. For the value $p = 0$, let $Z_\lambda^{-1}$ be this overall infinite
factor contained within $f$. If we divide out the infinite factor from the vertex
function, then we have:

$$ i\tilde\Gamma^{(4)}(0) = \lambda \qquad (7.26) $$


where $\lambda$ is the physical, renormalized coupling constant. In fact, we can take Eqs.
(7.21) and (7.26) to be the definition of the mass and coupling constant, measured
at the point $p = 0$. Because of the ambiguity in manipulating finite and infinite
quantities, this definition is not unique. We could have defined the physical mass
and coupling constant at some arbitrary momentum scale $\mu$ as well as $p = 0$. In
other words, these quantities are actually functions of $\mu$. (Of course, when we
perform an experiment and actually measure the physical masses and coupling
constants, we do so at a fixed momentum, so there is no ambiguity. However,
if these experiments could be performed at different momenta, then the effective
physical constants may change.)

For example, we could have defined the vertex function at the point $p^2 = m^2$
and $s = t = u = 4m^2/3$. In general, we can define the masses and coupling
constants at a different momentum $p = \mu$ via:


$$ i\Gamma^{(2)}(p^2) = p^2 - m^2(\mu), \qquad i\Gamma^{(4)}(\mu) = \lambda(\mu) \tag{7.27} $$


This point $\mu$ is called the renormalization point or the subtraction point. The
ambiguity introduced by this subtraction point $\mu$ will appear repeatedly throughout
this book, and will be studied in more detail in Chapter 14.

Now that we have isolated the divergent multiplicative quantities, we can
easily pull them out of any divergent Feynman diagram by redefining the coupling
constants and masses. To see how this is done, let us split the $Z_\phi$ factor occurring
with every renormalized propagator in a Feynman diagram into two pieces, $(\sqrt{Z_\phi})^2$.
In this diagram, move each factor of $\sqrt{Z_\phi}$ into the nearest vertex function. Since
each vertex function has four legs, the renormalized vertex function
receives the contribution of four of these factors, or $(\sqrt{Z_\phi})^4 = Z_\phi^2$. Since the
renormalization of the vertex function contributes an additional factor of $Z_\lambda^{-1}$,
the $\lambda_0$ sitting in front of the vertex function picks up a factor of $Z_\lambda^{-1} Z_\phi^2$. But
this means that the original bare coupling constant $\lambda_0$ is now modified by this
multiplicative renormalization as follows:


$$ \lambda_0 \to \lambda = \lambda_0 Z_\lambda^{-1} Z_\phi^2 \tag{7.28} $$


which we define to be the renormalized coupling constant.

In this way, we can move all factors of $Z_\phi$ into the various vertex functions,
renormalizing the coupling constant, except for the propagators that are connected
to the external legs. We have a leftover factor of $\sqrt{Z_\phi}$ for each external leg. This
last factor can be eliminated by wave function renormalization. (As we saw in
Chapter 5, it was necessary to include a wave function renormalization factor
$Z^{-1/2}$ in the definition of “in” and “out” states.) Since the wave function is
not a measurable quantity, we can always eliminate the last factors of $Z_\phi$ by
renormalizing the wave function.

In summary, we first began with the $\phi^4$ theory defined totally in terms of
the unrenormalized, bare coupling constants and masses, $m_0$ and $\lambda_0$, which are
formally infinite. Then, by summing over infinite classes of diagrams, we found
that the modified propagator had a pole at the shifted renormalized mass: $p^2 = m^2$.
Furthermore, the propagator and vertex function were multiplied in front by infinite
renormalization constants $Z_\phi$ and $Z_\lambda^{-1}$, respectively. By splitting up the $Z_\phi$ at
each propagator into two pieces, we could redistribute these constants such that
the new coupling constant became $\lambda_0 Z_\phi^2 / Z_\lambda$. The reshuffling of renormalization
constants can be summarized as follows:


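$$
\begin{aligned}
\phi &= Z_\phi^{-1/2}\,\phi_0 \\
\lambda &= Z_\lambda^{-1} Z_\phi^2\,\lambda_0 \\
m^2 &= m_0^2 + \delta m^2
\end{aligned}
\tag{7.29}
$$


where $\delta m^2 = \Sigma(m^2)$.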

This can also be generalized to the n-point function. If we take an
unrenormalized vertex $\Gamma_0^{(n)}(p_i, \lambda_0, m_0)$, we can begin summing the graphs within
the unrenormalized vertex to convert it to the renormalized one. In doing so, we
pick up extra multiplicative factors of $Z_\phi$ and $Z_\lambda$, which allow us to renormalize
the coupling constant and shift the bare mass. There are n external legs within
$\Gamma_0^{(n)}$ that have no propagators, and hence do not contribute their share of the $Z_\phi^{1/2}$
factors. This leaves us with an overall factor of $Z_\phi^{-n/2}$. Then we are left with:


$$ \Gamma_0^{(n)}(p_i, \lambda_0, m_0) = Z_\phi^{-n/2}\,\Gamma^{(n)}(p_i, \lambda, m, \mu) \tag{7.30} $$


Although renormalization of $\phi^4$ seems straightforward, unfortunately, there
are two technical questions that remain unanswered.

First, as we mentioned earlier, a more rigorous proof must grapple with the
problem of overlapping divergences, which is the primary source of complication
in any renormalization program. Salam‘ was perhaps the first to fully appreciate
the importance of these overlapping divergences. Because of these divergences,
the final steps needed to renormalize $\phi^4$ are, in some sense, more difficult than
the renormalization of QED. As a result, the problem of overlapping divergences
for $\phi^4$ theory will be discussed at length later in this chapter. We will then solve
the problem of overlapping divergences in Chapter 13, when we study the BPHZ
renormalization method. We will then complete the renormalizability of $\phi^4$ theory
in Chapter 14, where we develop the theory of the renormalization group.

Besides the multiplicative renormalization method, there is yet another way to
perform renormalization, and this is to proceed backwards, that is, start with the
usual $\phi^4$ action defined with the physical coupling constants and masses, which,
of course, are finite. Then, as we calculate Feynman diagrams to each order, we 
find the usual divergences. The key point is that we can cancel these divergences 
by adding counterterms to the original action (which are proportional to terms in 
the original action). This second method is called the counterterm method. 

These counterterms contribute new terms to the Feynman series that cancel the 
original divergences to that order. At the next order, we then find new divergences, 
so we add new counterterms (again, proportional to terms in the original action), 
which cancel the divergences to that order. The final action is then the original 
renormalized action plus an infinite sequence of counterterms, to all orders. Be- 
cause all the counterterms are proportional to terms in the original action, we wind 
up with the unrenormalized action defined in terms of unrenormalized parameters 
(which was the starting point of the previous procedure). 

To see the close link between the multiplicative renormalization approach and
the counterterm approach, let us start with the renormalized action:


$$ \mathcal{L} = \frac{1}{2}\left[(\partial_\mu\phi)^2 - m^2\phi^2\right] - \frac{\lambda}{4!}\phi^4 \tag{7.31} $$


where $\lambda$ and m are the renormalized, finite quantities. We then find divergences
with this action, so we add counterterms to the action, such that these counterterms
cancel the divergences that appear to that order. Since the infinities that arise are
similar to the infinities we encountered with the multiplicative renormalization
program, we find that the coefficients of the counterterms can be summed to yield
the quantities $Z_\phi$, $Z_\lambda$, and $\delta m^2$ found earlier.

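To see this, let us add the counterterm $\Delta\mathcal{L}$ to the Lagrangian:


$$ \mathcal{L} \to \mathcal{L} + \Delta\mathcal{L} \tag{7.32} $$


where the counterterm $\Delta\mathcal{L}$ must cancel the divergences coming from the two-point
and four-point graphs, which are contained within $\Sigma(p^2)$ and $i\Gamma^{(4)}(p)$. Using the
subtraction point $\mu = 0$, we find that the counterterm that cancels these divergences
can be written as (to lowest loop order):


$$ \Delta\mathcal{L} = \frac{\Sigma'(0)}{2}(\partial_\mu\phi)^2 + \frac{\Sigma(0)}{2}\phi^2 + \frac{i\Gamma^{(4)}(0)}{4!}\phi^4 \tag{7.33} $$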


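Now make the definitions:


$$
\begin{aligned}
\Sigma'(0) &= Z_\phi - 1 \\
\Sigma(0) &= -(Z_\phi - 1)m^2 + \delta m^2 \\
\Gamma^{(4)}(0) &= -i\lambda(1 - Z_\lambda)
\end{aligned}
\tag{7.34}
$$


which are equivalent (to lowest loop order) to the definitions in Eq. (7.18).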
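With these definitions, we now have:


$$ \Delta\mathcal{L} = \frac{1}{2}(Z_\phi - 1)\left[(\partial_\mu\phi)^2 - m^2\phi^2\right] + \frac{\delta m^2}{2}\phi^2 - \frac{\lambda}{4!}(Z_\lambda - 1)\phi^4 \tag{7.35} $$


If we add $\mathcal{L}$ and $\Delta\mathcal{L}$ together, we find that we retrieve the original unrenormalized
action $\mathcal{L}_0$:


$$ \mathcal{L}_0 = \mathcal{L} + \Delta\mathcal{L} \tag{7.36} $$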


This intuitively shows the equivalence of the multiplicative renormalization and 
the counterterm methods. Thus, it is a matter of taste which method we use. In 
practice, however, this second method of adding counterterms is perhaps more 
widely used. The point is, however, that both the multiplicative approach and 
the counterterm approach are equivalent. The fact that the counterterms were 
proportional to terms in the original action made this equivalence possible. 
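
As a cross-check of Eq. (7.36), one can substitute the rescalings of Eq. (7.29) directly into the bare Lagrangian. The sketch below assumes that the bare Lagrangian has the same form as Eq. (7.31) with $\phi_0$, $m_0$, $\lambda_0$ in place of $\phi$, $m$, $\lambda$; that form is an assumption about notation, not a formula quoted from the text.

```latex
\begin{aligned}
\mathcal{L}_0 &= \tfrac{1}{2}\left[(\partial_\mu\phi_0)^2 - m_0^2\phi_0^2\right] - \frac{\lambda_0}{4!}\phi_0^4,
\qquad
\phi_0 = Z_\phi^{1/2}\phi,\quad
m_0^2 = m^2 - \delta m^2,\quad
\lambda_0 = Z_\lambda Z_\phi^{-2}\lambda \\
&= \tfrac{1}{2}Z_\phi\left[(\partial_\mu\phi)^2 - m^2\phi^2\right]
 + \tfrac{1}{2}Z_\phi\,\delta m^2\,\phi^2
 - \frac{\lambda}{4!}Z_\lambda\,\phi^4 \\
&= \mathcal{L}
 + \tfrac{1}{2}(Z_\phi - 1)\left[(\partial_\mu\phi)^2 - m^2\phi^2\right]
 + \tfrac{1}{2}Z_\phi\,\delta m^2\,\phi^2
 - \frac{\lambda}{4!}(Z_\lambda - 1)\phi^4 .
\end{aligned}
```

The last line has the structure of the counterterm Lagrangian in Eq. (7.35). The only difference is the factor $Z_\phi$ multiplying $\delta m^2$, which departs from 1 only at higher loop order, in line with the lowest-order statement made above.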

7.4 Overview of Renormalization in QED


Let us now begin an overview of the renormalization of QED. As in any quantum 
field theory, the first step involves power counting. Once again, our goal is to 
show that the degree of divergence of any graph is independent of the number of 
internal loops. 

Let us count the superficial degree of divergence of each graph in QED. We
define:


$$
\begin{aligned}
L &= \text{number of loops} \\
V &= \text{number of vertices} \\
E_\psi &= \text{number of external electron legs} \\
I_\psi &= \text{number of internal electron legs} \\
E_A &= \text{number of external photon legs} \\
I_A &= \text{number of internal photon legs}
\end{aligned}
\tag{7.37}
$$


Then the superficial degree of divergence is:

$$ D = 4L - 2I_A - I_\psi \tag{7.38} $$
We can rewrite this equation so that it is only a function of the external legs of
the graph, no matter how many internal legs or loops it may have. Each vertex,
for example, has two electron line ends attached to it. An internal electron leg
supplies two such ends, while an external electron leg supplies only one, so that
$2V = 2I_\psi + E_\psi$. Thus:


$$ V = I_\psi + \frac{1}{2}E_\psi \tag{7.39} $$


Likewise, each vertex connects to one end of a photon line. An internal photon line
has both ends on vertices, while an external photon line has only one. Thus:


$$ V = 2I_A + E_A \tag{7.40} $$

Also, we know that the total number of independent momenta is equal to L, which
in turn equals the total number of internal lines in the graph minus the number of
vertices (since we have momentum conservation at each vertex) plus one (since
we also have overall momentum conservation). Thus:


$$ L = I_\psi + I_A - V + 1 \tag{7.41} $$


Figure 7.2. The list of divergent classes of graphs in QED. Only the first three graphs, 
however, are truly divergent if we use gauge invariance and Furry’s theorem. 


Putting everything together, we find the final formula:


$$ D = 4 - \frac{3}{2}E_\psi - E_A \tag{7.42} $$


This is very fortunate, because once again it shows that the total divergence of any 
Feynman graph, no matter how complicated, is only dependent on the number of 
external electron and photon legs. In fact, we find that the only graphs that diverge 
are the following (Fig. 7.2): 


1. Electron self-energy graph (D = 1). 
2. Electron—photon vertex graph (D = 0). 


3. Photon vacuum polarization graph (D = 2) (by gauge invariance, one can 
reduce the divergence of this graph by two). 


4. Three-photon graph (D = 1)—these graphs cancel because the internal elec- 
tron line travels in both clockwise and counterclockwise direction. (If the 
internal fermion line is reversed, then this can be viewed as reversing the 
charge of the electron at the vertex. Since there are an odd number of vertices, 
the overall sign of the graph flips.) Then the two graphs cancel against each 
other by Furry’s theorem. 


5. Four-photon graph (D = 0)—this graph is actually convergent if we use gauge 
invariance. 
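
The power counting above can be checked mechanically. The following is a minimal sketch, not part of the original text: the function names and the table of one-loop examples are illustrative assumptions. It evaluates D once from the internal lines, Eq. (7.38), and once from the external legs only, $D = 4 - \frac{3}{2}E_\psi - E_A$, and confirms that the two agree for the lowest-order graph in each of the five classes listed above.

```python
"""Superficial degree of divergence for QED graphs (power-counting sketch)."""


def degree_from_internal(loops: int, internal_electrons: int, internal_photons: int) -> int:
    """Eq. (7.38): D = 4L - 2 I_A - I_psi."""
    return 4 * loops - 2 * internal_photons - internal_electrons


def degree_from_external(external_electrons: int, external_photons: int) -> int:
    """External-leg form: D = 4 - (3/2) E_psi - E_A."""
    if external_electrons % 2 != 0:
        # Every vertex joins two electron line ends, so E_psi is always even.
        raise ValueError("E_psi must be even")
    return 4 - (3 * external_electrons) // 2 - external_photons


# Hypothetical example table: (name, L, I_psi, I_A, E_psi, E_A)
# for the lowest-order graph in each divergent class.
ONE_LOOP_GRAPHS = [
    ("electron self-energy", 1, 1, 1, 2, 0),
    ("electron-photon vertex", 1, 2, 1, 2, 1),
    ("vacuum polarization", 1, 2, 0, 0, 2),
    ("three-photon graph", 1, 3, 0, 0, 3),
    ("four-photon graph", 1, 4, 0, 0, 4),
]


def compare_counts():
    """Return (name, D from internal lines, D from external legs) for each graph."""
    return [
        (name, degree_from_internal(L, i_psi, i_a), degree_from_external(e_psi, e_a))
        for name, L, i_psi, i_a, e_psi, e_a in ONE_LOOP_GRAPHS
    ]


if __name__ == "__main__":
    for name, d_int, d_ext in compare_counts():
        print(f"{name:24s} D = {d_int} (internal count) = {d_ext} (external count)")
```

Adding more external legs only lowers D. For example, a graph with four external electron legs and no external photons has D = -2 and is superficially convergent.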

Figure 7.3. The complete propagator $S'_F$ is the sum over one-particle irreducible graphs
$\Sigma$ arranged along a chain.


Only the first three classes of graphs are actually divergent. Fortunately, this
is also the set of divergent graphs that can be absorbed into the redefinition of
physical parameters, the coupling constant, and the electron mass. (The photon
mass, we shall see, is not renormalized because of gauge invariance.) The one-loop
graphs that we want particularly to analyze are therefore the electron self-energy
correction $\Sigma(\not{p})$, the photon propagator $D_{\mu\nu}$, and the vertex correction $\Gamma_\mu$:


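$$ -i\Sigma(\not{p}) = (-ie)^2\int\frac{d^4k}{(2\pi)^4}\,\gamma_\mu\,\frac{i}{\not{p} - \not{k} - m}\,\frac{-ig^{\mu\nu}}{k^2}\,\gamma_\nu \tag{7.43} $$

$$
\begin{aligned}
i\Pi_{\mu\nu}(k) &= -(-ie)^2\int\frac{d^4p}{(2\pi)^4}\,\mathrm{Tr}\left(\gamma_\mu\,\frac{i}{\not{p} - m}\,\gamma_\nu\,\frac{i}{\not{p} - \not{k} - m}\right) \\
iD'_{\mu\nu} &= -\frac{ig_{\mu\nu}}{k^2} + \left(-\frac{ig_{\mu\alpha}}{k^2}\right) i\Pi^{\alpha\beta}\left(-\frac{ig_{\beta\nu}}{k^2}\right) \\
\Lambda_\mu(p, q, p') &= (-ie)^2\int\frac{d^4k}{(2\pi)^4}\,\frac{-i}{k^2}\,\gamma^\nu\,\frac{i}{\not{p}' - \not{k} - m}\,\gamma_\mu\,\frac{i}{\not{p} - \not{k} - m}\,\gamma_\nu
\end{aligned}
\tag{7.44}
$$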


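To begin renormalization theory, we will, as in $\phi^4$ theory, sum all possible
graphs appearing in the complete propagator (Fig. 7.3):


$$ iS'_F(p) = iS_F(p) + iS_F(p)\left[-i\Sigma(\not{p})\right]iS_F(p) + \cdots = \frac{i}{\not{p} - m_0 - \Sigma(\not{p}) + i\epsilon} \tag{7.45} $$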

Then we make a Taylor expansion of the mass correction $\Sigma(\not{p})$ around $\not{p} = m$,
where m is the finite renormalized mass, which is arbitrary:


$$ \Sigma(\not{p}) = \Sigma(m) + (\not{p} - m)\Sigma'(m) + \hat\Sigma(\not{p}) \tag{7.46} $$


where $\hat\Sigma(\not{p}) \sim O\left((\not{p} - m)^2\right)$ and vanishes for $\not{p} = m$. Since $m_0$ is divergent and
arbitrary, we will choose $m_0$ and m so that $m_0$ cancels the divergence coming from
$\Sigma(m)$. We will choose:


$$ m_0 + \Sigma(m) = m \tag{7.47} $$


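(Again, there is a certain ambiguity in how this cancellation takes place.)
Inserting the value of $\Sigma(\not{p})$ into the renormalized electron propagator, we now
can rearrange terms, just as in the $\phi^4$ case, to find:


$$
\begin{aligned}
iS'_F(p) &= \frac{i}{\not{p} - m_0 - \Sigma(m) - (\not{p} - m)\Sigma'(m) - \hat\Sigma(\not{p})} \\
&= \frac{i}{[1 - \Sigma'(m)](\not{p} - m) - \hat\Sigma(\not{p}) + i\epsilon} \\
&= \frac{iZ_2}{\not{p} - m - \tilde\Sigma(\not{p}) + i\epsilon}
\end{aligned}
\tag{7.48}
$$

where we have defined:

$$
\begin{aligned}
m &= m_0 + \Sigma(m) \\
Z_2 &= \frac{1}{1 - \Sigma'(m)} \\
\tilde\Sigma(\not{p}) &= \hat\Sigma(\not{p})\,[1 - \Sigma'(m)]^{-1} = Z_2\,\hat\Sigma(\not{p})
\end{aligned}
\tag{7.49}
$$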


Since the divergence of the complete propagator $S'_F$ is contained within $Z_2$,
we can remove this term and define the renormalized propagator $\tilde S'_F$:


$$ S'_F(p) = Z_2\,\tilde S'_F(p) \tag{7.50} $$


(Throughout this chapter, we will label the divergent, complete propagators like
$S'_F$ with the prime, and the finite, renormalized propagators like $\tilde S'_F$ with a tilde
and a prime.) As before, by summing this subclass of diagrams for the electron
self-energy, we have been able to redefine the mass of the electron, such that the
physical mass m is actually finite, and also show that the electron propagator is
multiplicatively renormalized by the factor $Z_2$, which corresponds to the field
rescaling:


$$ \psi_0 = \sqrt{Z_2}\,\psi \tag{7.51} $$

Next, we analyze the photon propagator $D_{\mu\nu}$ in the same way, summing over
infinite classes of one-particle irreducible graphs:


$$ D'_{\mu\nu} = D_{\mu\nu} - D_{\mu\alpha}\Pi^{\alpha\beta}D_{\beta\nu} + \cdots \tag{7.52} $$


However, a naive application of the previous methods would generate a
renormalized mass for the photon, which would be disastrous, as we insist that the
physical photon be massless. This is the first example of how gauge invariance
helps preserve certain properties of the theory to all orders in perturbation
theory. The photon propagator must be gauge invariant, meaning that $k_\mu\Pi^{\mu\nu} = 0$.
This constraint, in turn, allows us to make the following decomposition of the
second-rank tensor into a scalar quantity $\Pi(k^2)$:


$$ \Pi^{\mu\nu}(k) = (k^\mu k^\nu - g^{\mu\nu}k^2)\,\Pi(k^2) \tag{7.53} $$


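As before, we now make the infinite summation of corrections to the photon
propagator:


$$ D'_{\mu\nu} = -\frac{g_{\mu\nu}}{k^2\left[1 + \Pi(k^2)\right]} + \text{gauge terms} \tag{7.54} $$

We then power expand $\Pi(k^2)$ around $k^2 = 0$:

$$ \Pi(k^2) = \Pi(0) + \hat\Pi(k^2) \tag{7.55} $$

Inserting this back into the propagator, we find:

$$ D'_{\mu\nu} = -\frac{Z_3\,g_{\mu\nu}}{k^2\left[1 + \tilde\Pi(k^2)\right]} + \text{gauge terms} \tag{7.56} $$


where the last term is proportional to $k_\mu k_\nu$ and hence will be dropped. We have
also defined:


$$ Z_3 = \frac{1}{1 + \Pi(0)}, \qquad \tilde\Pi(k^2) = \hat\Pi(k^2)\,[1 + \Pi(0)]^{-1} = Z_3\,\hat\Pi(k^2) \tag{7.57} $$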

Then the finite, renormalized propagator $\tilde D'_{\mu\nu}$ can be defined by extracting out
$Z_3$:


$$ D'_{\mu\nu} = Z_3\,\tilde D'_{\mu\nu} \tag{7.58} $$

As before, this allows us to make the following wave function renormalization:

$$ A_{\mu 0} = \sqrt{Z_3}\,A_\mu \tag{7.59} $$


Finally, we wish to study the effect of renormalization on the electron—photon 
vertex function. We first define the vertex function: 


$$ \Gamma_\mu(p, p') = \gamma_\mu + \Lambda_\mu(p, p') \tag{7.60} $$


After summing all graphs, we find that the vertex graph is infinite, and that we
can parametrize this divergence by introducing a third infinite quantity, $Z_1$, such
that:


$$ \Gamma_\mu(p, p') = \frac{1}{Z_1}\,\tilde\Gamma_\mu(p, p') \tag{7.61} $$

or:

$$ \gamma_\mu + \Lambda_\mu(p, p') = \frac{1}{Z_1}\left[\gamma_\mu + \tilde\Lambda_\mu(p, p')\right] \tag{7.62} $$


As in the case of the electron mass, which was renormalized by the infinite
quantity $\delta m$, we will also use $Z_1$ to renormalize the coupling constant. The
renormalized coupling constant e, as shown above, receives a multiplicative correction
$1/Z_1$ from the divergent part of the vertex graph. However, we also know that we
have to renormalize the fermion and photon lines. Since there are two fermion
lines and one photon line attached to each vertex function, we must multiply the
coupling constant by another factor $(\sqrt{Z_2})^2\sqrt{Z_3}$.

The renormalized coupling constant is therefore:


$$ e = \frac{Z_2\sqrt{Z_3}}{Z_1}\,e_0 \tag{7.63} $$


(Later, using what are called Ward–Takahashi identities, we will show that some
of these renormalization constants are equal, i.e., $Z_1 = Z_2$, so that the condition
on the renormalized coupling constant reduces to $e = \sqrt{Z_3}\,e_0$.)

As we mentioned earlier, instead of using multiplicative renormalization, we 
could have alternatively used the counterterm approach. This means starting with 
the theory defined with the physical parameters and then computing divergent self- 
energy and vertex corrections. Then we add counterterms to the action which, 
order by order, cancel these divergences. If we then add the renormalized action 
to the counterterms, we will reproduce the bare action. 

If we start with the action defined with renormalized parameters:


$$ \mathcal{L} = -\frac{1}{4}(F_{\mu\nu})^2 + \bar\psi(i\not{\partial} - m)\psi - e\bar\psi\not{A}\psi \tag{7.64} $$


then the theory produces divergent amplitudes, which can be cancelled order by 
order by adding counterterms to the action. Since the infinities that we encounter 
are identical to the ones we found with the multiplicative renormalization program, 
it is not surprising that the coefficients of these counterterms can be written in 
terms of the same Z’s. A careful analysis yields: 


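$$ \mathcal{L} \to \mathcal{L} + \Delta\mathcal{L} \tag{7.65} $$

where:


$$ \Delta\mathcal{L} = -\frac{1}{4}(Z_3 - 1)(F_{\mu\nu})^2 + (Z_2 - 1)\bar\psi(i\not{\partial} - m)\psi + Z_2\,\delta m\,\bar\psi\psi - e(Z_1 - 1)\bar\psi\not{A}\psi \tag{7.66} $$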
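Adding these two terms together, we find:


$$ \mathcal{L}_0 = -\frac{1}{4}Z_3(F_{\mu\nu})^2 + Z_2\bar\psi(i\not{\partial} - m_0)\psi - Z_1 e\bar\psi\not{A}\psi \tag{7.67} $$

If we change variables to the unrenormalized quantities:


$$
\begin{aligned}
\psi_0 &= \sqrt{Z_2}\,\psi \\
A_{\mu 0} &= \sqrt{Z_3}\,A_\mu \\
e_0 &= \frac{Z_1}{Z_2\sqrt{Z_3}}\,e \\
m_0 &= m - \delta m
\end{aligned}
\tag{7.68}
$$

then our action becomes the unrenormalized one:

$$ \mathcal{L} + \Delta\mathcal{L} = \mathcal{L}_0 \tag{7.69} $$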


Finally, we would like to clarify the arbitrariness introduced into the theory
by the subtraction point $\mu$, the point at which we define the masses and coupling
constants in Eq. (7.27). This ambiguity arose because of the way that finite parts
were handled when canceling infinities. For example, we recall that the one-loop
calculation of the photon vacuum polarization graph in Section 6.6, using a cutoff
$\Lambda$, yielded:


$$ \Pi(k^2) = \frac{\alpha_0}{3\pi}\log\frac{\Lambda^2}{k^2} \tag{7.70} $$

In order to extract out the renormalization constant $Z_3$, we must split $\Pi(k^2)$ into two
parts, a constant part and a momentum-dependent part, as in Eq. (7.55). However,
there are an infinite number of ways that we can perform this split. We could
equally well have split the momentum-dependent and momentum-independent
parts as:

$$ \Pi(k^2) = \frac{\alpha_0}{3\pi}\left(\log\frac{\Lambda^2}{\mu^2} + \log\frac{\mu^2}{k^2}\right) = \Pi(\Lambda, \mu) + \hat\Pi(k, \Lambda, \mu) \tag{7.71} $$

where $\mu$ is arbitrary. Of course, we have done nothing. We have added and
subtracted the same term, $\log\mu^2$. However, by adding finite terms to infinite
quantities, we have conceptually made a significant change by altering the nature
of the split. With this new split, the photon propagator in Eq. (7.56) can now be
rewritten as:


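$$ D'_{\mu\nu} = -\frac{Z_3(\Lambda, \mu)\,g_{\mu\nu}}{k^2\left[1 + \tilde\Pi(k^2, \Lambda, \mu)\right]} \tag{7.72} $$

It is essential to notice that $Z_3$ now has an explicit dependence on $\mu$:

$$
\begin{aligned}
Z_3(\Lambda, \mu) &= \frac{1}{1 + \dfrac{\alpha_0}{3\pi}\log\dfrac{\Lambda^2}{\mu^2}} \\
\tilde\Pi(k, \Lambda, \mu) &= Z_3(\Lambda, \mu)\,\hat\Pi(k^2, \Lambda, \mu)
\end{aligned}
\tag{7.73}
$$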


However, since the renormalized coupling constant $\alpha$ is a function of $Z_3$, we
can isolate the dependence of $\alpha$ on $\mu$. To do this, we note that the unrenormalized
$\alpha_0$ is, by definition, independent of $\mu$. Now choose two different renormalization
points, $\mu_1$ and $\mu_2$. Since $\alpha_0$ can be written in terms of $\mu_1$ or $\mu_2$, we have:


$$ \alpha_0 = \frac{\alpha(\mu_1)}{Z_3(\Lambda, \mu_1)} = \frac{\alpha(\mu_2)}{Z_3(\Lambda, \mu_2)} \tag{7.74} $$

where we have set $Z_1 = Z_2$. Substituting in the value of $Z_3(\Lambda, \mu)$, we have:


$$ \frac{1}{\alpha(\mu_1)} - \frac{1}{\alpha(\mu_2)} = \frac{2}{3\pi}\log\frac{\mu_2}{\mu_1} \tag{7.75} $$
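
The cancellation of the cutoff in this relation can be checked numerically. The sketch below is illustrative only: it assumes the one-loop form $Z_3(\Lambda, \mu) = [1 + (\alpha_0/3\pi)\log(\Lambda^2/\mu^2)]^{-1}$ and $\alpha(\mu) = Z_3(\Lambda, \mu)\,\alpha_0$ (with $Z_1 = Z_2$), and the function names and numerical inputs are hypothetical. It verifies that $1/\alpha(\mu_1) - 1/\alpha(\mu_2)$ equals $(2/3\pi)\log(\mu_2/\mu_1)$ for any choice of $\Lambda$.

```python
import math


def z3(cutoff: float, mu: float, alpha0: float) -> float:
    """Assumed one-loop form: Z_3 = 1 / (1 + (alpha0 / 3 pi) log(Lambda^2 / mu^2))."""
    return 1.0 / (1.0 + alpha0 / (3.0 * math.pi) * math.log(cutoff**2 / mu**2))


def alpha(mu: float, cutoff: float, alpha0: float) -> float:
    """Renormalized coupling alpha(mu) = Z_3(Lambda, mu) * alpha0."""
    return z3(cutoff, mu, alpha0) * alpha0


def inverse_alpha_difference(mu1: float, mu2: float, cutoff: float, alpha0: float) -> float:
    """Left-hand side: 1/alpha(mu1) - 1/alpha(mu2), computed with an explicit cutoff."""
    return 1.0 / alpha(mu1, cutoff, alpha0) - 1.0 / alpha(mu2, cutoff, alpha0)


def predicted_difference(mu1: float, mu2: float) -> float:
    """Right-hand side: (2 / 3 pi) log(mu2 / mu1), with no cutoff dependence."""
    return 2.0 / (3.0 * math.pi) * math.log(mu2 / mu1)


if __name__ == "__main__":
    # Arbitrary illustrative numbers, not physical inputs.
    for cutoff in (1.0e3, 1.0e6):
        lhs = inverse_alpha_difference(1.0, 10.0, cutoff, alpha0=0.1)
        print(f"Lambda = {cutoff:.0e}: lhs = {lhs:.12f}, rhs = {predicted_difference(1.0, 10.0):.12f}")
```

The cutoff and the bare coupling drop out of the difference, which is why the relation can be stated purely in terms of the two renormalization points.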


To lowest order, we have therefore derived a nontrivial relation, following 
from the renormalization group analysis,>° which expresses how the effective 
coupling constants depend on the point $\mu$, where we define the masses and coupling
constants via Eq. (7.27). These renormalization group equations will prove crucial 
in our discussion of the asymptotic behavior of gauge theory in Chapter 14, where 
we give a more precise derivation of these relations. They will prove essential in 
demonstrating that QCD is the leading theory of the strong interactions. 

7.5 Types of Regularization


This completes a brief sketch of how the renormalization program works for $\phi^4$ and
QED. Now comes the difficult part of filling in the essential details of the
renormalization program. We will now concentrate on the regularization schemes, certain
complications such as the Ward-Takahashi identity and overlapping graphs, and 
then we will present the proof that QED is renormalizable. 

Over the decades, a wide variety of regularization schemes have been devel- 
oped, each with their own distinct advantages and disadvantages. Each regular- 
ization scheme necessarily breaks some feature of the original action: 


1. Pauli-Villars regularization 
Until recently, this was one of the most widely used regularization schemes.
We cut off the integrals by assuming the existence of a fictitious particle of
mass M. The propagator becomes modified by:


$$ \frac{1}{p^2 - m^2} \to \frac{1}{p^2 - m^2} - \frac{1}{p^2 - M^2} = \frac{m^2 - M^2}{(p^2 - m^2)(p^2 - M^2)} \tag{7.76} $$


The relative minus sign in the propagator means that the new particle is a ghost;
that is, it has negative norm. This means that we have explicitly broken the
unitarity of the theory. The propagator now behaves as $1/p^4$, which is usually
enough to render all graphs finite. Then, we take the limit as $M^2 \to \infty$
so that the unphysical fermion decouples from the theory. The advantage 
of the Pauli—Villars technique is that it preserves local gauge invariance in 
QED; hence the Ward identities are preserved (although they are broken for 
higher groups). There have been a large number of variations proposed to the 
Pauli—Villars technique, such as higher covariant derivatives in gauge theory 
and higher R? terms in quantum gravity. 


2. Dimensional regularization 

This is perhaps the most versatile and simplest of the recent regularizations. 
Dimensional regularization involves generalizing the action to arbitrary di- 
mension d, where there are regions in complex d space in which the Feynman 
integrals are all finite. Then, as we analytically continue d to four, the Feyn- 
man graphs pick up poles in d space, allowing us to absorb the divergences of 
the theory into the physical parameters. Dimensional regularization obviously 
preserves all properties of the theory that are independent of the dimension of 
space-time, such as the Ward—Takahashi identities. 

3. Lattice regularization

This is the most widely used regularization scheme in QCD for nonpertur- 
bative calculations. Here, we assume that space-time is actually a set of 
discrete points arranged in some form of hypercubical array. The lattice 
spacing then serves as the cutoff for the space-time integrals. For QCD, the 
lattice is gauge invariant, but Lorentz invariance is manifestly broken. The 
great advantage of this approach is that, with Monte Carlo techniques, one
can extract qualitative and even some quantitative information from
QCD. One disadvantage with this approach is that it is defined in Euclidean
space; so we are at present limited to calculating only the static properties of 
QCD; the lattice has difficulty describing Minkowski space quantities, such 
as scattering amplitudes. 
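
The improved high-momentum behavior of the Pauli-Villars propagator in item 1 can be illustrated numerically. This is a minimal sketch under stated assumptions: it treats $p^2$ as a real number away from the poles, drops the $i\epsilon$ prescription, and uses hypothetical function names. It checks that the difference of the two propagators equals $(m^2 - M^2)/[(p^2 - m^2)(p^2 - M^2)]$ and that $p^4$ times the regulated propagator tends to the constant $m^2 - M^2$.

```python
import math


def propagator(p2: float, mass2: float) -> float:
    """Scalar propagator 1 / (p^2 - m^2), without the i*epsilon term."""
    return 1.0 / (p2 - mass2)


def regulated_propagator(p2: float, m2: float, big_m2: float) -> float:
    """Pauli-Villars form: physical propagator minus the heavy ghost propagator."""
    return propagator(p2, m2) - propagator(p2, big_m2)


def combined_form(p2: float, m2: float, big_m2: float) -> float:
    """The same quantity written over a common denominator."""
    return (m2 - big_m2) / ((p2 - m2) * (p2 - big_m2))


if __name__ == "__main__":
    m2, big_m2 = 1.0, 100.0  # illustrative values for m^2 and M^2
    for p2 in (7.3, 1.0e4, 1.0e8):
        reg = combined_form(p2, m2, big_m2)
        print(f"p^2 = {p2:10.1e}   p^2 * bare = {p2 * propagator(p2, m2):.6f}"
              f"   p^4 * regulated = {p2**2 * reg:.6f}")
```

The unregulated propagator falls off only as $1/p^2$, while the regulated one falls off as $1/p^4$, with a coefficient that grows with $M^2$; this is the sense in which the heavy mass acts as the cutoff.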


In this chapter, we will mainly stress the dimensional regularization method.’~'’ 
We will now show explicitly, to lowest order only, how dimensional regularization 
can regulate the divergences of the theory so that we are only manipulating finite 
quantities (albeit in an unphysical dimension). Then later we will show how to put 
this all together and renormalize field theory to all orders in perturbation theory. 

Our starting point is the action for scalar mesons and for QED in d dimensions: 


4—d 
where $\mu$ is an arbitrary parameter with the dimension of mass. It is necessary to
insert this dimensional parameter because the dimension of the fermion field is
$[\psi] = (d - 1)/2$ while the dimension of the boson field is $[A_\mu] = (d/2) - 1$. (Our
final result must be independent of the choice of $\mu$. In this formalism, $\mu$ takes
the place of the subtraction point introduced earlier in the renormalization-group
equations.)

Generalizing space-time to d dimensions, we are interested in evaluating the 
integral in Eq. (7.11): 


$$ -i\Sigma(p^2) = \frac{\lambda_0\,\mu^{4-d}}{2}\int\frac{d^dp}{(2\pi)^d}\,\frac{1}{p^2 - m_0^2 + i\epsilon} \tag{7.78} $$

Our goal is to find a region in complex d space where this integral is finite, and 
then analytically continue to d = 4, where we expect to pick up poles in divergent 
quantities. Then the renormalization scheme consists of absorbing all these poles 
into the coupling constants and masses of the theory. 
To begin the dimensional regularization process, we first must express the
integration over $d^dp$ for arbitrary dimension. To calculate the volume integral over
d space, we remind ourselves that the change from Cartesian to polar coordinates
in two dimensions is given by:


$$ x_1 = r\cos\theta_1, \qquad x_2 = r\sin\theta_1 \tag{7.79} $$

In three dimensions, we use the transformation to spherical coordinates given
by:


$$
\begin{aligned}
x_1 &= r\cos\theta_1 \\
x_2 &= r\sin\theta_1\cos\theta_2 \\
x_3 &= r\sin\theta_1\sin\theta_2
\end{aligned}
\tag{7.80}
$$


Given these two examples, it is not hard to write down the transformation from 
Cartesian to d-dimensional spherical coordinates, which spans all of d space. (We 
will make a change of variables in the time parameter, converting real time to 
imaginary time. This is called a Wick rotation, which takes us from Minkowski 
space to Euclidean space, where we can perform all d-dimensional integrals all 
at once. We then Wick rotate back to Minkowski coordinates at the end of the 
calculation. Alternatively, we could have done the calculation completely in 
Minkowski space, where the dx° integration is handled differently from the other 
integrations.) The d-dimensional transformation is given by: 


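$$
\begin{aligned}
x_1 &= r\cos\theta_1 \\
x_2 &= r\sin\theta_1\cos\theta_2 \\
x_3 &= r\sin\theta_1\sin\theta_2\cos\theta_3 \\
x_4 &= r\sin\theta_1\sin\theta_2\sin\theta_3\cos\theta_4 \\
&\;\;\vdots \\
x_i &= r\left(\prod_{k=1}^{i-1}\sin\theta_k\right)\cos\theta_i \\
&\;\;\vdots \\
x_d &= r\prod_{k=1}^{d-1}\sin\theta_k
\end{aligned}
\tag{7.81}
$$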
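
By induction, one can prove that the Jacobian from Cartesian to spherical
coordinates is given by:


$$ J = \det\left(\frac{\partial(x_1, x_2, \ldots, x_d)}{\partial(r, \theta_1, \theta_2, \ldots, \theta_{d-1})}\right) = r^{d-1}\prod_{i=1}^{d-1}\sin^{i-1}\theta_i \tag{7.82} $$

(With the angles labeled exactly as in Eq. (7.81), the power of $\sin\theta_i$ is $d - 1 - i$;
the form above corresponds to the relabeling $\theta_i \to \theta_{d-i}$, which does not affect
the integrated solid angle.) The volume element in d space is therefore given by:


$$ d^dx = J\,dr\prod_{i=1}^{d-1}d\theta_i = r^{d-1}\,dr\,d\Omega_d = r^{d-1}\,dr\prod_{i=1}^{d-1}\sin^{i-1}\theta_i\,d\theta_i \tag{7.83} $$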

To evaluate this integral, we use the fact that:


$$ \int_0^\pi \sin^m\theta\,d\theta = \frac{\sqrt{\pi}\,\Gamma\left(\frac{1}{2}(m + 1)\right)}{\Gamma\left(\frac{1}{2}(m + 2)\right)} \tag{7.84} $$


Then in d dimensions, we have: 

The integration that we want to perform can therefore be written in the form:


$$ \int\frac{d^dp}{(p^2 + 2p\cdot q - m^2)^\alpha} = \int d\Omega_d\int_0^\infty\frac{r^{d-1}\,dr}{(r^2 - q^2 - m^2)^\alpha} \tag{7.86} $$

(In the last expression, we have shifted the integration variable, $p \to p - q$, in order to
eliminate the cross term between p and q. We are also working in Euclidean
space and will Wick rotate back to Minkowski space later.) The integral over the
solid angle is given by:


$$ \int d\Omega_d = \frac{2\pi^{d/2}}{\Gamma(d/2)} \tag{7.87} $$
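
The solid-angle formula can be cross-checked against the angular integrals. The sketch below is an illustration with hypothetical function names: it assumes that the angle carrying the weight $\sin^0\theta$ runs over $[0, 2\pi]$ while all other angles run over $[0, \pi]$, multiplies the closed-form values of $\int_0^\pi \sin^m\theta\,d\theta$, and compares the product with $2\pi^{d/2}/\Gamma(d/2)$. It also checks the closed form of the sine integral against a simple midpoint-rule quadrature.

```python
import math


def sin_power_integral(m: int) -> float:
    """Closed form of the integral of sin^m(theta) over [0, pi], as in Eq. (7.84)."""
    return math.sqrt(math.pi) * math.gamma((m + 1) / 2) / math.gamma((m + 2) / 2)


def sin_power_integral_numeric(m: int, steps: int = 20000) -> float:
    """Midpoint-rule estimate of the same integral, used only as a cross-check."""
    h = math.pi / steps
    return h * sum(math.sin((k + 0.5) * h) ** m for k in range(steps))


def solid_angle_from_angles(d: int) -> float:
    """Product of angular integrals with weights sin^(i-1)(theta_i), i = 1..d-1.

    Assumption: the i = 1 angle (weight sin^0) covers [0, 2*pi]; the rest cover [0, pi].
    """
    total = 2.0 * math.pi
    for i in range(2, d):
        total *= sin_power_integral(i - 1)
    return total


def solid_angle_closed_form(d: int) -> float:
    """Total solid angle 2 * pi^(d/2) / Gamma(d/2)."""
    return 2.0 * math.pi ** (d / 2) / math.gamma(d / 2)


if __name__ == "__main__":
    for d in range(2, 7):
        print(d, solid_angle_from_angles(d), solid_angle_closed_form(d))
```

For d = 2, 3, and 4 this reproduces the familiar values $2\pi$, $4\pi$, and $2\pi^2$.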


We are left, therefore, with an integral over r. This integral over r is of the form 
of an Euler Beta function: 


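$$ B(x, y) = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x + y)} = 2\int_0^\infty dt\,t^{2x-1}(1 + t^2)^{-x-y} \tag{7.88} $$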

With this formula, we can prove:


$$ \int_0^\infty\frac{r^\beta\,dr}{(r^2 + C^2)^\alpha} = \frac{\Gamma\left(\frac{1}{2}(1 + \beta)\right)\Gamma\left(\alpha - \frac{1}{2}(1 + \beta)\right)}{2\,(C^2)^{\alpha - (1+\beta)/2}\,\Gamma(\alpha)} \tag{7.89} $$
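
The radial formula of Eq. (7.89) can be tested numerically. The following sketch is not from the original text: the function names are hypothetical, and the numerical integral uses the substitution $r = C\tan t$, which maps the half-line onto $[0, \pi/2)$ and turns the integrand into $C^{\beta + 1 - 2\alpha}\sin^\beta t\,\cos^{2\alpha - \beta - 2} t$. It assumes $2\alpha > \beta + 1$, so that the integral converges.

```python
import math


def radial_closed_form(alpha: float, beta: float, c2: float) -> float:
    """Right-hand side of Eq. (7.89) for the integral of r^beta / (r^2 + C^2)^alpha."""
    half = (1.0 + beta) / 2.0
    return (math.gamma(half) * math.gamma(alpha - half)
            / (2.0 * c2 ** (alpha - half) * math.gamma(alpha)))


def radial_numeric(alpha: float, beta: float, c2: float, steps: int = 20000) -> float:
    """Midpoint-rule estimate after substituting r = C tan(t), 0 <= t < pi/2."""
    if 2.0 * alpha <= beta + 1.0:
        raise ValueError("integral diverges unless 2*alpha > beta + 1")
    c = math.sqrt(c2)
    h = (math.pi / 2.0) / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += math.sin(t) ** beta * math.cos(t) ** (2.0 * alpha - beta - 2.0)
    return c ** (beta + 1.0 - 2.0 * alpha) * h * total


if __name__ == "__main__":
    # beta = d - 1 with d = 4 and alpha = 3: a convergent four-dimensional example.
    print(radial_closed_form(3.0, 3.0, 2.25), radial_numeric(3.0, 3.0, 2.25))
```

Setting $\beta = d - 1$ gives the radial integral needed for the d-dimensional momentum integration.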
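We are left with the integral:

$$ \int_0^\infty\frac{r^{d-1}\,dr}{(r^2 + C^2)^\alpha} = \frac{\Gamma(d/2)\,\Gamma(\alpha - d/2)}{2\,(C^2)^{\alpha - d/2}\,\Gamma(\alpha)} \tag{7.90} $$

The final result is therefore:

$$ \int\frac{d^dp}{(p^2 + 2p\cdot q - m^2)^\alpha} = \frac{i\pi^{d/2}\,\Gamma(\alpha - d/2)}{\Gamma(\alpha)}\,\frac{1}{(-q^2 - m^2)^{\alpha - d/2}} \tag{7.91} $$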

(The extra i appears when we rotate back from Euclidean space to Minkowski
space.)

Given these formulas, let us first analyze the one-loop correction to the $\phi^4$
theory, which is given by $\Sigma$, and then generalize this discussion for the Dirac
theory. The one-loop correction in $\phi^4$ theory in Eq. (7.78) is given by:


$$
\begin{aligned}
-i\Sigma(p^2) &= -\frac{i\lambda_0 m_0^2}{32\pi^2}\,\Gamma\left(1 - \frac{d}{2}\right)\left(\frac{4\pi\mu^2}{-m_0^2}\right)^{2 - d/2} \\
&= \frac{i\lambda_0 m_0^2}{16\pi^2}\,\frac{1}{4 - d} + \cdots
\end{aligned}
\tag{7.92}
$$

where we have extracted the pole term in the $\Gamma$ function as $d \to 4$. These poles in
d space, as we pointed out earlier, can all be absorbed into the renormalization of 
physical coupling constants and masses. This is our final result for the one-loop, 
two-point divergence to this order. 
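
The pole extraction can be made concrete with the standard Laurent expansions of the Gamma function near its poles. The sketch below assumes the notation $\epsilon = 4 - d$ and uses hypothetical function names; it compares `math.gamma` with the leading terms $\Gamma(2 - d/2) \approx 2/\epsilon - \gamma$ and $\Gamma(1 - d/2) \approx -2/\epsilon + \gamma - 1$, where $\gamma$ is the Euler-Mascheroni constant.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def gamma_two_minus_half_d(d: float) -> float:
    """Exact Gamma(2 - d/2), which has a simple pole at d = 4."""
    return math.gamma(2.0 - d / 2.0)


def gamma_one_minus_half_d(d: float) -> float:
    """Exact Gamma(1 - d/2), which also has a simple pole at d = 4."""
    return math.gamma(1.0 - d / 2.0)


def pole_terms_two(d: float) -> float:
    """Leading terms of Gamma(2 - d/2): 2/eps - gamma, with eps = 4 - d (assumed notation)."""
    eps = 4.0 - d
    return 2.0 / eps - EULER_GAMMA


def pole_terms_one(d: float) -> float:
    """Leading terms of Gamma(1 - d/2): -2/eps + gamma - 1, with eps = 4 - d."""
    eps = 4.0 - d
    return -2.0 / eps + EULER_GAMMA - 1.0


if __name__ == "__main__":
    for d in (3.9, 3.99, 3.999):
        print(d, gamma_one_minus_half_d(d), pole_terms_one(d))
```

The coefficient $-2/(4 - d)$ in $\Gamma(1 - d/2)$ is what converts the prefactor $-i\lambda_0 m_0^2/32\pi^2$ into the pole term $i\lambda_0 m_0^2/[16\pi^2(4 - d)]$.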

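To generalize this discussion to the Dirac propagator, we must perform the
dimensional integral with two Feynman propagators, so there are two terms in the
denominator. It is easier if we combine the two factors in the denominator together;
so we will once again use Feynman’s parameter trick. First, let us calculate the
electron self-energy correction, which can be written in d-dimensional space as:


$$
\begin{aligned}
\Sigma(\not{p}) &= -ie^2\mu^{4-d}\int\frac{d^dk}{(2\pi)^d}\,\frac{\gamma_\mu(\not{p} - \not{k} + m_0)\gamma^\mu}{\left[(p - k)^2 - m_0^2\right]k^2} \\
&= -ie^2\mu^{4-d}\int_0^1 dx\int\frac{d^dk}{(2\pi)^d}\,\frac{\gamma_\mu(\not{p} - \not{k} + m_0)\gamma^\mu}{\left[(p - k)^2x - m_0^2x + k^2(1 - x)\right]^2} \\
&= -ie^2\mu^{4-d}\int_0^1 dx\int\frac{d^dq}{(2\pi)^d}\,\frac{\gamma_\mu\left[\not{p}(1 - x) - \not{q} + m_0\right]\gamma^\mu}{\left[q^2 - m_0^2x + p^2x(1 - x)\right]^2} \\
&= \frac{e^2\mu^{4-d}\,\Gamma(2 - d/2)}{(4\pi)^{d/2}}\int_0^1 dx\,\gamma_\mu\left[\not{p}(1 - x) + m_0\right]\gamma^\mu \times \left[m_0^2x - p^2x(1 - x)\right]^{d/2 - 2}
\end{aligned}
\tag{7.93}
$$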


where we have made a sequence of steps: (1) we used the Feynman parameter
trick; (2) we made the substitution $q = k - px$; (3) we integrated over q, dropping
terms linear in q. Finally, we look for the pole in $\Gamma(2 - d/2)$ as $d \to 4$. Let us
perform the integration over x:


1 
Lip) dx {2y(1 — x) — 4mo — e[p(1 — x) + mo]} 
(where $\gamma$ is the Euler–Mascheroni constant).

Comparing this with the expansion for $\Sigma(\not{p})$ in Eq. (7.49), we now have a
result for:


(7.95) 


Next, we evaluate the vacuum polarization contribution to the photon propa- 
gator. Once again, we will use the Feynman parameter trick: 


i = ee [deen ely 
(7.96)


where $q \to q - kx$ and $f(d) = \mathrm{Tr}\,I$. The first and third terms on the
right-hand side of the equation cancel, leaving us with only a logarithmic divergence.
(Gauge invariance has thus reduced a quadratically divergent graph to only a
logarithmically divergent one.) Extracting the pole as $d \to 4$, we find:


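$$ \Pi_{\mu\nu}(k) = \frac{e^2}{2\pi^2}\left(k_\mu k_\nu - g_{\mu\nu}k^2\right)\left[\frac{1}{3\epsilon} - \frac{\gamma}{6} - \int_0^1 dx\,x(1 - x)\log\frac{m_0^2 - k^2x(1 - x)}{4\pi\mu^2}\right] \tag{7.97} $$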

Using Eq. (7.57), this means that we can write:


(7.98) 

Finally, we must write down the dimensionally regularized vertex correction
dtk yp —K + mo)yu(Y—K + mov” 
$\Lambda_\mu(p, q, p')$


where we have made the substitution $k \to k - px - p'y$.
This expression, in turn, contains a divergent and a convergent part. The
convergent part, as before, gives us the contribution to the anomalous magnetic
moment of the electron.
From Eq. (7.61), the divergent part can be isolated as:

This, in turn, means that we can write:

$$ Z_1 = 1 - \frac{e^2}{8\pi^2\epsilon} \tag{7.101} $$


Notice that $Z_1 = Z_2$, as we expected from the Ward–Takahashi identity.
In summary, we have, to lowest order, the correction to the electron self-energy,
the photon propagator, and the vertex function:

These, in turn, give us the expression for the renormalization constants:

Comparing this result with the Pauli–Villars regularization method as in Eq.
(6.98), we find that, to lowest order, the divergences are identical if we make the
following correspondence between the pole $1/\epsilon$ and the cutoff $\Lambda$:

Thus, to this order, our results are independent of the regularization procedure.


7.6 Ward–Takahashi Identities


When we generalize our discussion to all orders in perturbation theory, our work 
will be vastly simplified by a set of Ward-Takahashi identities,'®'? which reduce 
the number of independent renormalization constants $Z_i$. When we general-
ize our discussion to include gauge theories in later chapters, we will see that 
Ward—Takahashi identities are an essential ingredient in proving that a theory is 
renormalizable. 

Specifically, we will use the fact that: 


$$\Lambda_\mu(p, p) = -\frac{\partial}{\partial p^\mu} \Sigma(p) \qquad (7.105)$$


where we have written the vertex correction in terms of the electron self-energy 
correction. 

To prove this and more complicated identities arising from gauge theories, 
the most convenient formalism is the path integral formalism. However, for our 
purposes, we can prove the simplest Ward—Takahashi identities from graphical 
methods. To prove this identity, we observe that: 


$$\frac{\partial}{\partial p^\mu} \frac{1}{\not{p} - m} = -\frac{1}{\not{p} - m} \gamma_\mu \frac{1}{\not{p} - m} \qquad (7.106)$$

(To prove this matrix equation, we set up a difference equation and then take the
limit for small $\delta p_\mu$.) Now let us use the above identity on an arbitrary graph that
appears within $\Sigma(\not{p})$. In general, we can write $\Sigma(\not{p})$ as follows to bring out its $p$
dependence:


$$\Sigma(\not{p}) = \cdots \frac{1}{\not{p} + \not{q} - m} \cdots \qquad (7.107)$$


where the ellipses may represent very complicated integrals that are not important 
to our discussion. Also, there may be a large number of propagators with p 
dependence within $\Sigma(\not{p})$, which we suppress for the moment. Now take the
derivative with respect to $p_\mu$, and we find:
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where we must differentiate over all propagators that have a p dependence. The 
term on the right is therefore the sum over all terms that have electron propagators 
containing p dependence. 
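
The propagator identity of Eq. (7.106), on which this argument rests, can be checked numerically. The following is only a minimal sketch, assuming the Dirac representation of the gamma matrices and the metric signature (+,-,-,-); the function names and the sample momentum are illustrative and do not come from the text. It compares a central finite difference of $1/(\not{p} - m)$ with the right-hand side of the identity.

```python
import numpy as np

# Dirac-representation gamma matrices gamma^mu (upper index); an assumed convention.
I2 = np.eye(2)
Z2 = np.zeros((2, 2))
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gamma_up = [np.block([[I2, Z2], [Z2, -I2]]).astype(complex)] + [
    np.block([[Z2, s], [-s, Z2]]) for s in PAULI
]
metric = np.diag([1.0, -1.0, -1.0, -1.0])
# gamma_mu (lower index) = g_{mu nu} gamma^nu for a diagonal metric.
gamma_low = [metric[mu, mu] * gamma_up[mu] for mu in range(4)]


def slash(p_up):
    """p-slash = gamma_mu p^mu for contravariant components p^mu."""
    return sum(gamma_low[mu] * p_up[mu] for mu in range(4))


def propagator(p_up, m):
    """1/(p-slash - m) as a 4x4 matrix (overall factors of i are dropped)."""
    return np.linalg.inv(slash(p_up) - m * np.eye(4))


def max_identity_error(p_up, m, h=1e-5):
    """Largest entry of |dS/dp^mu + S gamma_mu S| over mu = 0..3."""
    s = propagator(p_up, m)
    worst = 0.0
    for mu in range(4):
        step = np.zeros(4)
        step[mu] = h
        # Central difference approximates the derivative with respect to p^mu.
        ds = (propagator(p_up + step, m) - propagator(p_up - step, m)) / (2 * h)
        worst = max(worst, np.abs(ds + s @ gamma_low[mu] @ s).max())
    return worst


if __name__ == "__main__":
    # An arbitrary off-shell momentum, chosen only for illustration.
    print(max_identity_error(np.array([1.3, 0.2, -0.5, 0.7]), m=0.5))
```

The residual should be limited only by the finite-difference step, which is the numerical counterpart of "setting up a difference equation and taking the limit".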

The right-hand side of the previous equation is now precisely the graph
corresponding to a vertex correction connected to an external photon line with zero
momentum. This in turn means that the right-hand side equals $\Gamma_\mu(p, p)$, as promised.
The purpose of this rather simple exercise is to prove a relation between
renormalization constants. Now let us insert all this into the Ward-Takahashi identity.
Differentiating the full $\Sigma(\not{p})$ to all orders, we find:


a) _  @ Z 
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= —Yy,%'(m) ae d.d(P) 
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where we have used Eq. (7.61). 
Comparing terms proportional to the Dirac matrix, we see that the term
$(Z_1^{-1} - 1)\gamma_\mu$ must be proportional to $-\Sigma'(m)$, which we calculated earlier
to be equal to $-1 + (1/Z_2)$ in Eq. (7.49).
Comparing these two coefficients of $\gamma_\mu$, we easily find:


$$Z_1 = Z_2 \qquad (7.110)$$


which is our desired result. For a complete proof of renormalization, however, 
we need a more powerful version of the Ward—Takahashi identity, rather than the 
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one defined for zero-momentum external photons. We will use the version: 
$$(p' - p)_\mu \Gamma^\mu(p', p) = S_F'^{-1}(p') - S_F'^{-1}(p) \qquad (7.111)$$


This is easily shown to lowest order, where $S_F^{-1} = \not{p} - m$. Then the Ward-Takahashi
identity reduces to the simple expression $(p' - p)_\mu \gamma^\mu = (\not{p}' - m - \not{p} + m)$.
We can also show that this expression, in the limit that $p \to p'$, reduces to the
previous version of the Ward-Takahashi identity.

This can also be rewritten in terms of the one-particle irreducible graph $\Sigma$.
Because $S_F'^{-1} = \not{p} - m - \Sigma(p)$, we have:


$$(p' - p)_\mu \Lambda^\mu = -[\Sigma(p') - \Sigma(p)] \qquad (7.112)$$
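
The zero-momentum limit mentioned above can be made explicit. The following is a short sketch, assuming only Eq. (7.111), the decomposition $\Gamma^\mu = \gamma^\mu + \Lambda^\mu$ used elsewhere in this chapter, and $S_F'^{-1} = \not{p} - m - \Sigma(p)$; the small displacement $\delta p$ is notation introduced here for illustration.

```latex
% Set p' = p + \delta p in Eq. (7.111) and expand to first order in \delta p:
\delta p_\mu \, \Gamma^\mu(p, p)
  = S_F'^{-1}(p + \delta p) - S_F'^{-1}(p)
  = \delta p_\mu \, \frac{\partial S_F'^{-1}(p)}{\partial p_\mu} + O(\delta p^2)

% Since \delta p_\mu is arbitrary, the coefficients must agree:
\Gamma^\mu(p, p)
  = \frac{\partial S_F'^{-1}(p)}{\partial p_\mu}
  = \gamma^\mu - \frac{\partial \Sigma(p)}{\partial p_\mu}

% Subtracting \gamma^\mu from both sides:
\Lambda^\mu(p, p) = -\frac{\partial \Sigma(p)}{\partial p_\mu}
```

The last line is Eq. (7.105), so the identity in Eq. (7.112) contains the zero-momentum version as a special case.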


The proof of this identity is also performed by graphical methods. We use the fact
that: 
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To prove the general form of the Ward—Takahashi identity, we simply use this 
identity everywhere along a fermion line or along a fermion loop. This procedure 
is schematically represented in Figure 7.4. 

Consider a specific member of $\Sigma(\not{p})_{\alpha\beta}$ and follow the fermion line that connects
the index $\alpha$ with $\beta$. Then schematically, we have:


sin= 3 ff (Taga) (7.115) 
i=0 i 


where $p_n = p_0$, where we have omitted all momentum integrations, and where $a_i$
denote the momentum-dependent photon lines that are connected to other fermion
loops (which are temporarily dropped). The matrices are ordered sequentially.
Also, the total momenta contained within the various $a_i$ must sum to zero.

Let us do the same for $\Lambda_\mu$. If we follow the fermion line connecting the two
fermion ends of $\Lambda_\mu$, we find:
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Figure 7.4. The Ward—Takahashi identity is proved by examining the insertion of a photon 
line along a fermion line contained within $\Sigma$ and showing that we retrieve $\Lambda_\mu$.

The $\gamma_\mu$ is inserted at the $r$th point along the fermion line, and we sum over all
possible values of $r$. In other words, we sum over all possible insertion points of
$\gamma_\mu$ along the fermion line. The essential point in the proof is the observation that
there is a graphical similarity between the vertex correction and the self-energy
correction. Graphically speaking, if we take all the graphs within $\Sigma$ and attach an
extra photon line along all possible fermion propagators, then the resulting graphs
are identical to the graphs within $\Lambda_\mu$.

To show this rigorously, let us now contract $\Lambda_\mu$ with $q^\mu$ and use the identity
shown previously. Then each term in the $r$th sum in $\Lambda_\mu$ splits into two parts. We
now have $2n$ possible terms, which cancel pairwise because of the minus sign.
The only terms that do not cancel are the first and last terms:
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where po = p and p’ = po+q. 
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The last step in the proof is to notice that we could have attached the photon 
leg of momentum q along any closed fermion loop as well. However, then the 
pairwise cancellation is exact, and the contribution of all closed loops is zero. 
This completes the proof of the Ward—Takahashi identity to all orders. 
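
The pairwise cancellation along an open fermion line can be illustrated numerically. The sketch below is only a toy check under stated assumptions: the photon attachments are replaced by random 4x4 matrices, all loop integrations are omitted, factors of $i$ and $e$ are dropped, and the momenta, mass and function names are hypothetical choices made for the example. It relies on the elementary identity $\frac{1}{\not{p} + \not{q} - m} \not{q} \frac{1}{\not{p} - m} = \frac{1}{\not{p} - m} - \frac{1}{\not{p} + \not{q} - m}$.

```python
import numpy as np


def gamma_lower():
    """gamma_mu (lower index), Dirac representation, metric (+,-,-,-); assumed convention."""
    i2, z2 = np.eye(2), np.zeros((2, 2))
    paulis = [
        np.array([[0, 1], [1, 0]], dtype=complex),
        np.array([[0, -1j], [1j, 0]], dtype=complex),
        np.array([[1, 0], [0, -1]], dtype=complex),
    ]
    g0 = np.block([[i2, z2], [z2, -i2]]).astype(complex)
    # Lowering a spatial index flips the sign: gamma_i = -gamma^i.
    return [g0] + [-np.block([[z2, s], [-s, z2]]) for s in paulis]


GAMMA = gamma_lower()


def slash(p):
    """p-slash = gamma_mu p^mu."""
    return sum(g * c for g, c in zip(GAMMA, p))


def prop(p, m):
    """1/(p-slash - m), with overall factors of i dropped."""
    return np.linalg.inv(slash(p) - m * np.eye(4))


def line(vertices, momenta, q, m, n_shifted):
    """V_0 S(p_1) V_1 ... S(p_n) V_n; the first n_shifted propagators carry p + q."""
    out = vertices[0]
    for i, p in enumerate(momenta):
        out = out @ prop(p + q if i < n_shifted else p, m) @ vertices[i + 1]
    return out


def line_with_insertion(vertices, momenta, q, m, r):
    """Insert q-slash on propagator r (0-based); earlier propagators carry p + q."""
    out = vertices[0]
    for i, p in enumerate(momenta):
        if i < r:
            s = prop(p + q, m)
        elif i == r:
            s = prop(p + q, m) @ slash(q) @ prop(p, m)
        else:
            s = prop(p, m)
        out = out @ s @ vertices[i + 1]
    return out


def telescoping_residual(n=4, m=3.0, seed=0):
    """Relative mismatch between the summed insertions and the two surviving end terms."""
    rng = np.random.default_rng(seed)
    # Components in [-1, 1] with m = 3 keep every propagator away from the mass shell.
    momenta = [rng.uniform(-1, 1, 4) for _ in range(n)]
    q = rng.uniform(-1, 1, 4)
    vertices = [rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
                for _ in range(n + 1)]
    inserted = sum(line_with_insertion(vertices, momenta, q, m, r) for r in range(n))
    ends = line(vertices, momenta, q, m, 0) - line(vertices, momenta, q, m, n)
    return np.abs(inserted - ends).max() / np.abs(ends).max()


if __name__ == "__main__":
    print(telescoping_residual())
```

Summing over all insertion points leaves only the fully unshifted line minus the fully shifted one, which is the matrix analogue of the statement that only the first and last terms survive.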


7.7 Overlapping Divergences 


As we mentioned earlier, the renormalization program either cuts off the diver- 
gences with counterterms or absorbs them by multiplicative renormalization. In 
order to study how this is done systematically, we can use the method of skeletons. 
Draw a box around each divergent electron and photon self-energy graph and each 
vertex graph. Then we can replace the self-energy insertions with a line and the 
vertex insertions with a point. In this way, we obtain a reduced graph. Then we 
repeat this process, drawing boxes around the self-energy and vertex insertions 
in the reduced graph, and reduce it once more. Eventually, we obtain a graph 
that can no longer be reduced; that is, it is irreducible. The final graph after all 
these reductions is called a skeleton. An irreducible graph is one that is its own 
skeleton. 

The advantage of introducing this concept is that the renormalization program
is reduced to canceling the divergences within each box. For example, consider
the complete vertex function $\Gamma_\mu$ summed to all orders. If we make the reduction
of $\Gamma_\mu$, we wind up with a sum over skeleton graphs, such that each line in the
skeleton corresponds to the proper self-energy graph $S_F'$ and $D_F'$ and each vertex
is the proper vertex. This can be summarized as:


$$\Gamma_\mu(p, p') = \gamma_\mu + \Lambda_\mu^S(S_F', D_F', \Gamma, e^2, p, p') \qquad (7.118)$$


In this way, all complete vertex functions can be written entirely in terms of 
skeleton graphs over proper self-energy graphs and proper vertices. 

This process of drawing boxes around the self-energy insertions and vertex
insertions is unambiguous as long as the boxes are disjoint or nested; that is,
the smaller boxes lie wholly within the larger one. Then the skeleton graph is
unique. However, the skeleton reduction that we have described has an ambiguity
that infests self-energy diagrams. For example, in Figure 7.5, we see examples of
overlapping diagrams where the boxes overlap and the skeleton is not unique. One
can show that these overlapping divergences occur for vertex insertions within
self-energy parts. Unless these overlapping divergences appearing in self-energy
graphs are handled correctly, we will overcount the number of graphs. These
overlapping divergences are the only real difficulty in proving the renormalization
of $\phi^4$ and QED.
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Figure 7.5. Examples of overlapping divergences found in QED. 


There are several solutions to this delicate problem of overlapping divergences. 
The cleanest and most powerful is the BPHZ program, which we will present in 
Chapter 13. For QED, the most direct solution was originally given by Ward, who
used the fact that, although $\Sigma(\not{p})$ has overlapping divergences, the vertex function
$\Lambda_\mu$ does not. Therefore, by using the Ward-Takahashi identity:


$$-\frac{\partial \Sigma(p)}{\partial p^\mu} = \Lambda_\mu(p, p) \qquad (7.119)$$


we can reduce all calculations to functions in which overlapping divergences are 
absent. 

Mathematically, taking the derivative with respect to $p^\mu$ is identical to inserting
a zero-momentum photon line at every electron propagator, as we saw in Eq. (7.106). This
means that every time an electron propagator appears in $\Sigma$, we replace it with
two electron propagators sandwiching a zero-momentum photon insertion matrix
$\gamma^\mu$. (However, we stress that there are still some serious, unresolved questions
concerning overlapping divergences that emerge in 14th order diagrams, which
we will discuss later.)

To see how this actually works, let us take the derivative of an overlapping
divergence appearing in $\Sigma$. In Figure 7.6, we see the effect that $\partial/\partial p_\mu$ has on the
overlapping divergence.

A single overlapping divergence has now split up into three pieces, each of
which has a well-defined skeleton decomposition. Therefore, the Ward-Takahashi
identity has helped to reduce the overlapping divergences within $\Sigma$ to the more
manageable problem of calculating the skeleton decomposition of the vertex $\Lambda_\mu$.
As long as we reduce all electron self-energy ambiguities to vertex insertions via
the Ward-Takahashi identity, the overlapping divergence problem is apparently
solved.

Figure 7.6. The effect that taking the partial derivative with respect to $p_\mu$ has on the
overlapping divergence.

Figure 7.7. Taking the derivative of the photon self-energy graph with respect to the
momentum yields these diagrams.

For photon self-energy graphs, there is also an overlapping divergence problem,
which is treated in much the same way. Although there is no Ward-Takahashi
identity for the photon self-energy graph, we can solve the problem by
introducing a new function $W_\mu$, which is defined by:


$$W_\mu(k) = \frac{\partial}{\partial k^\mu} [i D'(k^2)]^{-1} \qquad (7.120)$$
where: 
$$D'(k^2) = -\frac{1}{k^2 [1 + e^2 \Pi(k^2)]} \qquad (7.121)$$
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More explicitly, we can write: 
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The whole point of introducing this function W,,(k) is that it has no overlapping 
divergences. Once again, the net effect of the partial derivative is to convert the
photon self-energy graph into a vertex graph for $T$, which has no overlapping
divergences. To see how this works, in Figure 7.7 we have performed the
differentiation of an overlapping photon self-energy graph and created a vertex graph
that has no ambiguous skeleton decomposition.

In summary, as long as we properly replace all electron and photon self-energy
graphs with their vertex counterparts, there is usually no need to worry
about overlapping divergences.


7.8 Renormalization of QED


Over the years, a large number of renormalization programs have been developed, 
with various degrees of rigor, and each with their own advantages and disad- 
vantages. Since the reader will encounter one or more of these renormalization 
programs in his or her research, it is important that the reader be familiar with 
a few of these approaches. In this book, we will present three renormalization 
proofs: 


1. The original Dyson/Ward proof (which apparently breaks down at the
fourteenth order, due to overlapping divergences).


2. The BPHZ proof. 


3. Proof based on the Callan—Symanzik equations. 


We will first present the Dyson/Ward proof, which uses the auxiliary function $W_\mu$ to
handle the overlapping divergence problem. In a later chapter, we will renormalize 
quantum field theory using more modern and sophisticated techniques, such as 
the renormalization group and the BPHZ program, which are more general than 
the proof that we will present here. 

The Dyson/Ward proof of renormalization can be described in four steps: 


1. First write down the complete set of coupled equations containing all the
divergences of the theory for all self-energies and vertices. Everything is
written in terms of skeleton graphs defined over the (divergent) complete
propagators $S_F'$ and $D_{\mu\nu}'$.


2. Subtract off the infinite divergences only in the vertex functions $T$ and $\Lambda_\mu$,
which are free of overlapping divergences. Then define the new renormalized
self-energy functions $\tilde{S}_F'$ and $\tilde{D}_{\mu\nu}'$ in terms of these subtracted vertex parts.
This will enable us to write down an equivalent set of coupled equations for
the finite set of self-energy functions free of overlapping divergences.


3. Rewrite the subtraction process as a multiplicative rescaling of the vertex and 
self-energy parts. 


4. Absorb all multiplicative renormalizations into the coupling constant, masses, 
and wave functions. 


7.8.1 Step One 


The proof begins by writing down the expression for the unrenormalized
self-energy graphs and vertices, summed to all orders, in terms of their skeletons. It
will prove useful to summarize the notation that we will use:


$\Sigma(\not{p})$ → proper $e$ self-energy graph
$\Pi_{\mu\nu}$ → proper $\gamma$ self-energy graph
$\Gamma_\mu$ → proper vertex (unrenormalized)
$S_F'$ → complete $e$ self-energy graph
$D_{\mu\nu}'$ → complete $\gamma$ self-energy graph
$\tilde{S}_F'$ → renormalized $e$ self-energy graph $= Z_2^{-1} S_F'$
$\tilde{D}_{\mu\nu}'$ → renormalized $\gamma$ self-energy graph $= Z_3^{-1} D_{\mu\nu}'$
$\tilde{\Gamma}_\mu$ → renormalized vertex $= Z_1 \Gamma_\mu$
(7.123)

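Then the unrenormalized self-energy graphs and vertices satisfy:

$$\Gamma_\mu(p, p') = \gamma_\mu + \Lambda_\mu^S(S', D', \Gamma, e^2, p, p')$$
$$W_\mu(k) = 2ik_\mu + ik_\mu T^S(S', D', \Gamma, W, e^2, k^2)$$
$$S'(p)^{-1} = S'(p_0)^{-1} + (p - p_0)^\mu \Gamma_\mu(p, p_0)$$
$$D'(k^2)^{-1} = \int_0^1 dx\, k^\mu W_\mu(xk) \qquad (7.124)$$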
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where the superscript S denotes skeleton graphs, the prime denotes the complete 
propagator, which is the sum over divergent one-particle irreducible graphs. 

The first two equations are really definitions, telling us that the vertex graphs
$\Gamma_\mu$ and $W_\mu$ can be written in terms of skeleton graphs. (Because they have no
overlapping divergences, this can always be done.) The third equation is the
Ward-Takahashi identity, and the last equation is an integrated version of the
definition of $W_\mu$. Since these equations are all divergent, we must make the
transition to the renormalized quantities.


7.8.2 Step Two 


To find the convergent set of equations, we want to perform a subtraction only
on those functions that have no overlapping divergences; that is, we perform the
subtraction on $T$ and $\Lambda$. In this way, we remove the possibility of overcounting.
The subtracted functions are denoted by a tilde:


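$$\tilde{\Lambda}_\mu(p, p') = \Lambda_\mu^S(p, p') - \Lambda_\mu^S(p_0, p_0)\big|_{\not{p}_0 = \mu}$$
$$\tilde{T}(k^2) = T^S(k^2) - T^S(\mu^2) \qquad (7.125)$$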
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Only one subtraction is necessary since the diagrams are only logarithmically 
divergent. (As before, we stress that there is an infinite degree of freedom in
choosing $\mu$, the subtraction point. Whatever value of $\mu$ we choose, we demand
that the physics be independent of the choice.)

We note that $\Lambda_\mu^S(p_0, p_0)$ can be further reduced. Since it is defined for $\not{p}_0 = \mu$,
it can only be a function of $\gamma_\mu$. Thus, we can also write:


$$\Lambda_\mu^S(p_0, p_0)\big|_{\not{p}_0 = \mu} = L \gamma_\mu \qquad (7.126)$$


where L is divergent. 

The important point is that we have only performed the subtractions on the
quantities that have no overlapping divergences and no overcounting ambiguities,
that is, the vertex functions $T$ and $\Lambda$. This, in turn, allows us to define the finite
self-energy parts (with a tilde) via the following equations:


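$$\tilde{\Gamma}_\mu(p, p') = \gamma_\mu + \tilde{\Lambda}_\mu(\tilde{S}, \tilde{D}, \tilde{\Gamma}, \tilde{e}^2, p, p')$$
$$\tilde{W}_\mu(k) = 2ik_\mu + ik_\mu \tilde{T}(\tilde{S}, \tilde{D}, \tilde{\Gamma}, \tilde{W}, \tilde{e}^2, k^2)$$
$$\tilde{S}(p)^{-1} = \tilde{S}(p_0)^{-1} + (p - p_0)^\mu \tilde{\Gamma}_\mu(p, p_0)$$
$$\tilde{D}(k^2)^{-1} = \int_0^1 dx\, k^\mu \tilde{W}_\mu(xk) \qquad (7.127)$$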


The advantage of these definitions is that everything is now defined in terms of $T$
and $\Lambda$, which have no overlapping divergences. However, the quantities $\tilde{\Gamma}_\mu$, $\tilde{W}_\mu$
were simply defined by the previous equations. We still have no indication that
these subtracted functions have any relationship to the actual renormalized
self-energy and vertex parts.


7.8.3 Step Three 


Now comes the important step. Up to now, we have made many rather arbitrary 
definitions that, as yet, have no physical content. We must now show that these 
quantities with the tilde are, indeed, the renormalized quantities that we want. 

To see how this emerges, let us focus on the vertex graph. In terms of the 
subtraction, we can write: 


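$$\tilde{\Gamma}_\mu = \gamma_\mu + \tilde{\Lambda}_\mu$$
$$= \gamma_\mu + \Lambda_\mu^S - L \gamma_\mu$$
$$= (1 - L) \left( \gamma_\mu + \frac{1}{1 - L} \Lambda_\mu^S \right)$$
$$= Z_1 \left( \gamma_\mu + \frac{1}{Z_1} \Lambda_\mu^S \right) \qquad (7.128)$$

where we have defined:

$$Z_1 = 1 - L \qquad (7.129)$$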


Although we have factored out the renormalization constant $Z_1$ from this equation,
the right-hand side is still not in the correct form. We want the right-hand side to
be written in terms of the unrenormalized quantities, not the renormalized ones.

To find the scaling properties, it is useful to write the vertex functions
symbolically in terms of propagators $S$ and $D$ and vertices $\gamma_\mu$. Symbolically, by deleting
integrals, traces, etc., the vertices can be written as products of propagators and
vertices:


Aa Pen enn ODIO) wales 


(ea ~ ee S52 (Dy "(yaa igs? Gp 130) 
where $\sigma$ is an integer that increases by one for every differentiation of an electron
line.

Given this symbolic decomposition, we want to study their behavior under a
scaling given by:


$$S \to a^{-1} S, \quad D \to b D, \quad \gamma_\mu \to a \gamma_\mu, \quad e^2 \to b^{-1} e^2 \qquad (7.131)$$


(where a and b are proportional to renormalization constants). 
Then it is easy to show that the vertex functions scale as: 


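$$\Lambda_\mu^S(a^{-1} S, bD, a\gamma_\mu, b^{-1} e^2, p, p') = a \Lambda_\mu^S(S, D, \gamma_\mu, e^2, p, p')$$
$$T^S(a^{-1} S, bD, a\Gamma_\mu, b^{-1} W_\mu, b^{-1} e^2, k^2) = b^{-1} T^S(S, D, \Gamma_\mu, W_\mu, e^2, k^2) \qquad (7.132)$$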


With these rescaling relations, we can now absorb the factor $Z_1$ back into the
renormalized vertex function $\Lambda_\mu^S$ and convert it into an unrenormalized one. Let
$a = Z_2^{-1}$ and $b = Z_3$. We then have:


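$$\tilde{\Gamma}_\mu = Z_1 \left( \gamma_\mu + \frac{1}{Z_1} \Lambda_\mu^S(\tilde{S}, \tilde{D}, \tilde{\Gamma}, \tilde{e}^2, p, p') \right)$$
$$= Z_1 \left[ \gamma_\mu + \Lambda_\mu^S(Z_2 \tilde{S}, Z_3 \tilde{D}, Z_1^{-1} \tilde{\Gamma}, Z_3^{-1} \tilde{e}^2, p, p') \right]$$
$$= Z_1 \left[ \gamma_\mu + \Lambda_\mu^S(S', D', \Gamma, e^2, p, p') \right]$$
$$= Z_1 \Gamma_\mu \qquad (7.133)$$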


where we have used $Z_1 = Z_2$. This is the result that we wanted. We have now
shown that the subtracted quantity $\tilde{\Gamma}_\mu$, after a rescaling by $Z_2$ and $Z_3$, can be
written multiplicatively in terms of the unrenormalized quantity $\Gamma_\mu$. This justifies
the original definition of $\tilde{\Gamma}_\mu$ that we introduced earlier.
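
The scaling behavior used in this step follows from simple counting. The sketch below is illustrative rather than a reproduction of the text's symbolic formula: it assumes the rescaling $S \to a^{-1}S$, $D \to bD$, $\gamma_\mu \to a\gamma_\mu$, $e^2 \to b^{-1}e^2$, and it considers a vertex graph with $n$ internal photon lines attached to the open electron line. The label $n$ and the loop size $\ell$ are notation introduced here.

```latex
% Open electron line with n internal photons: 2n electron propagators,
% 2n + 1 vertices (one of them external), n photon propagators, and e^{2n}.
\Lambda_\mu \sim S^{2n} \, D^{n} \, (\gamma)^{2n+1} \, e^{2n}

% Under the rescaling, the factors combine as
(a^{-1})^{2n} \; b^{\,n} \; a^{2n+1} \; (b^{-1})^{n} = a

% A closed fermion loop with \ell vertices has \ell propagators,
% so it is invariant:
(a^{-1})^{\ell} \, a^{\ell} = 1

% Each internal photon line appears together with one factor of e^2,
% so it is invariant as well:
b \cdot b^{-1} = 1
```

Hence every graph in $\Lambda_\mu^S$ picks up exactly one overall factor of $a$, independent of its order, which is what allows the constant $Z_1^{-1}$ to be traded for a rescaling of the arguments.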


7.8.4 Step Four 


Now that we have renormalized the vertex, the rest is easy. The vertex $W_\mu$
can now be renormalized in the same way. With $a = Z_2^{-1}$ and $b = Z_3$, we have:


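$$\tilde{W}_\mu(k) = 2ik_\mu + ik_\mu \left( T^S(k^2) - T^S(\mu^2) \right)$$
$$= \left( 1 - \tfrac{1}{2} T^S(\mu^2) \right) \left( 2ik_\mu + ik_\mu \frac{1}{1 - \tfrac{1}{2} T^S(\mu^2)} T^S(k^2) \right)$$
$$= Z_3 \left( 2ik_\mu + ik_\mu \frac{1}{Z_3} T^S(k^2) \right)$$
$$= Z_3 \left[ 2ik_\mu + ik_\mu T^S(Z_2 \tilde{S}, Z_3 \tilde{D}, Z_1^{-1} \tilde{\Gamma}, Z_3^{-1} \tilde{W}, Z_3^{-1} \tilde{e}^2, k^2) \right]$$
$$= Z_3 \left[ 2ik_\mu + ik_\mu T^S(S', D', \Gamma, W, e^2, k^2) \right] \qquad (7.134)$$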


Thus, we have the other renormalized relation:

$$\tilde{W}_\mu = Z_3 W_\mu \qquad (7.135)$$


From the renormalization of these vertex functions, it is easy to renormalize 
the self-energy terms as well, since everything is multiplicative. We now have the 
following relations that show the link between renormalized and unrenormalized 
quantities: 


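$$\tilde{\Gamma}_\mu = Z_1 \Gamma_\mu, \qquad \tilde{W}_\mu = Z_3 W_\mu$$
$$\tilde{S}' = Z_2^{-1} S', \qquad \tilde{D}' = Z_3^{-1} D' \qquad (7.136)$$
$$\tilde{e}^2 = Z_3 e^2, \qquad Z_1 = Z_2$$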


In summary, we have proved that a subtraction of the divergences can be
re-expressed in terms of a multiplicative rescaling of the vertex and self-energy
parts. This means that all divergences are multiplicative and can be absorbed into
a renormalization of the coupling constants, masses, and wave functions.

Figure 7.8. This fourteenth-order photon self-energy Feynman diagram was shown by
Yang and Mills to suffer from a serious overlapping divergence problem, because of
ambiguities introduced when we take the momentum derivative. This graph apparently
invalidates the original Dyson/Ward renormalization proof.

Although this was thought to be the first complete proof of the renormaliz- 
ability of QED, this proof may be questioned in terms of its rigor. For example, 
it was shown by Yang and Mills*°? that there is an overlapping ambiguity at the 
fourteenth order in QED, thereby ruining this proof. For the electron self-energy, 
the Ward—Takahashi identity solves the problem of overlapping divergences, but 
for the photon self-energy, they showed that the operation of taking a momentum 
derivative is ambiguous at that level, thereby invalidating the proof (Fig. 7.8). 
(They also showed how it might be possible to remedy this problem, but did not 
complete this step.) 

Second, another criticism of this proof is that we necessarily had to manipulate 
functions that were sums of an infinite number of graphs. Although each graph 
may be finite, the sum certainly is not, because QED certainly diverges when we 
sum over all orders; that is, it is an asymptotic theory, not a convergent one. More 
specifically, what we want is a theory based on an induction process, such that at 
any finite order, all functions are manifestly finite. We need a functional equation 
that allows one to calculate all self-energy and vertex parts at the $(n+1)$st level
when we are given these functions at the $n$th level.

Since there are many functional equations that link the $n$th-order functions
to the $(n+1)$st-order functions, there are also many inductive schemes that can
renormalize field theory.7*4 One of the most useful of these schemes is the 
inductive process using the BPHZ and renormalization group equations (which 
will be discussed in further detail in Chapters 13 and 14). 

In summary, we have seen that renormalization theory gives us a solution to 
the ultraviolet divergence problem in quantum field theory. The renormalization 
program proceeded in several steps. First, by power counting, we isolated the 
divergences of all graphs, which must be a simple function of the number of 
external lines. Second, we regulated these divergences via cutoff or dimensional 
regularization. Third, we showed that these divergences can be canceled either by 
adding counterterms to the action, or by absorbing them into multiplicative renor- 
malizations of the physical parameters. Finally, we showed that all divergences 
can be absorbed in this way. 
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The weakness of our proof, however, is that it handles overlapping divergences 
in an awkward (possibly incorrect) way, and that it is not general enough to handle 
different kinds of field theories. It remains to be seen if the overlapping divergence 
problem can be truly solved in this formalism. In Chapters 13 and 14, we will 
present the BPHZ and renormalization group proofs of renormalization, which are 
not plagued by overlapping divergences and are much more versatile. 

This completes our discussion of QED. Next, in Part II we will discuss the 
Standard Model. 


7.9 Exercises 


1. Show that $Z_1 = Z_2$ to one-loop order for the electromagnetic field coupled to
a triplet of $\pi$ mesons, where $Z_1$ is the vertex renormalization constant, and
$Z_2$ is for the $\pi$ self-energy. (Hint: use the Ward-Takahashi identity.)


2. Do a power counting analysis of ¢” in q dimensions; isolate the graphs that 
are divergent. Confirm the statements made in the text concerning the renor- 
malizability or super-renormalizability of the theory in various dimensions. 


3. Draw all possible graphs necessary to prove the Ward—Takahashi identity for 
QED to order @?. 


4. Do a power counting of the massive Yukawa theory, with interaction term 
ww. Isolate all divergent graphs. Show that all divergences can be, in 
principle, moved into the physical parameters of the system. Outline the 
renormalization program. 


5. Analyze the renormalization properties of a derivative coupling theory: 


avy pan (7.137) 


Is the S matrix equal to one? Is the theory trivial? Consider making a 
field redefinition on the yw and w. Calculate the self-energy correction to the 
fermion propagator and verify your conjecture. 


6. Consider the one-loop electron self-energy diagram in QED. Let this electron 
also interact with an external scalar field via a derivative coupling, as in the 
previous problem. To first order in g, attach this derivative coupling term 
in all possible places along each electron propagator. There are three such 
graphs. Show that the sum of these three graphs is zero. 


7. Repeat the previous problem, except consider an electron line with all possible
photon lines attached to it to all orders in $e$. To first order in $g$, attach the
derivative coupling interaction along all electron propagators and show that
this also sums to zero.


8. Consider the one-loop correction to the propagator in a massive $\phi^3$ theory.
Calculate the one-loop correction using both the Pauli-Villars method and the
dimensional regularization method in order to find a relationship between $\epsilon$
and $\Lambda/m$.


9. Prove Eq. (7.91). Fill in the missing steps in its derivation that were omitted
in the text.


10. Fill in the missing steps in Eqs. (7.93), (7.94), (7.96), and (7.97).


11. Let $K_{\alpha\beta\gamma\delta}$ equal the complete four-electron Green's function, where the Greek
letters label the Dirac spinor indices. From this, we can construct what are 
called the Schwinger—Dyson equations for the electron vertex, with electron 
momenta p and p’: 


= d*q ¢- . 
Pulp’, Pys = Ziudys - | FE S0' satu! +4. p+ 4) 
x Sp(p +q)] 4, Kapsy(P +9, P’ +9,9) (7.138) 


and for the photon propagator: 


d*k 2 2 zs 
ILhv(g) =iZ ! Ome [ye Sp(AD Vk, k + q)Selk + q)| (7.139) 


Graphically, write down what these recursion relations look like. Then show 
that they are graphically correct to two-loop order. 


12. Show that $K$ does not suffer from overlapping divergences, which means that
the Schwinger—-Dyson equations (instead of the Ward—Takahashi identities) 
may be used to renormalize QED to all orders.”*.™ 


Part II


Gauge Theory 
and the Standard Model 


Chapter 8 
Path Integrals 


One feels as Cavalieri must have felt calculating the volume of a pyramid 
before the invention of calculus. 
—R. Feynman 


8.1 Postulates of Quantum Mechanics 


Previously, we outlined how to quantize field theories with various spins using the 
canonical quantization approach. However, for increasingly complex systems, 
such as gauge theory, quantum gravity, and superstring theory, canonical quanti- 
zation proves to be a very clumsy formalism since manifest Lorentz invariance is 
broken. Instead, we will explore a new method in this chapter. 

Perhaps the most powerful quantization method is the path integral approach, 
which was developed by Feynman,' based on an idea of Dirac.* The path integral 
method is versatile enough to handle a variety of different types of gauge theories. 
The path integral approach has many advantages over the other techniques: 


1. The path integral formalism yields a simple, covariant quantization of com- 
plicated systems with constraints, such as gauge theories. While calculations 
with the canonical approach are often prohibitively tedious, the path integral 
approach yields the results rather simply, vastly reducing the amount of work. 


2. The path integral formalism allows one to go easily back and forth between the 
other formalisms, such as the canonical or the various covariant approaches. In 
the path integral approach, these various formalisms are nothing but different 


choices of gauge. 


3. The path integral formalism is based intuitively on the fundamental principles
of quantum mechanics. Quantization prescriptions, which may seem rather
arbitrary in the operator formalism, have a simple physical interpretation in
the path integral formalism.


4. The path integral formalism can be used to calculate nonperturbative as well
as perturbative results.


5. The path integral formalism is based on c-number fields, rather than q-number
operators. Hence, the formalism is much easier to manipulate.


6. At present, there are a few complex systems with constraints that can only be 
quantized in the path integral formalism. 


7. Renormalization theory is much easier to express in terms of path integrals. 


Our discussion of the path integral formalism begins with two deceptively 
simple principles: 


8.1.1 Postulate I 


The probability $P(b, a)$ of a particle moving from point $a$ to point $b$ is the square
of the absolute value of a complex number, the transition function $K(b, a)$:


$$P(b, a) = |K(b, a)|^2 \qquad (8.1)$$


8.1.2 Postulate II


The transition function $K(b, a)$ is given by the sum of a phase factor $e^{iS/\hbar}$, where
$S$ is the action, taken over all possible paths from $a$ to $b$:

$$K(b, a) = \sum_{\text{paths}} k\, e^{iS/\hbar} \qquad (8.2)$$

where the constant $k$ can be determined by:

$$K(c, a) = \sum_{\text{paths}} K(c, b) K(b, a) \qquad (8.3)$$

where we sum over all intermediate points $b$ connecting $a$ and $c$.

These postulates incorporate the essence of the celebrated double slit exper- 
iment, where a beam of electrons passes through a barrier with two small holes. 
A screen is placed behind the barrier to detect the presence of the electrons. As a 
point particle, an electron cannot, of course, go through both holes simultaneously. 
Classically, therefore, we expect that the electrons will go through one slit or the 
other, leaving two distinct marks on the screen just behind the two holes. 

However, experiments show that the pattern created on the screen by repeated 
passages of the electrons through these holes is an interference pattern, associated 
with wave-like, not particle-like, behavior. Classically, we are therefore left with
a paradox. A point particle cannot go through both holes at once, yet the passage
of a large number of electrons successively going past the barrier clearly leaves an 
interference pattern, with minima and maxima, as if the electron somehow went 
through both holes. 

In the path integral approach, as in quantum mechanics, this puzzle can be 
resolved. The postulates of the path integral approach and quantum mechanics 
do not allow us to calculate the precise motion of a single point particle. They 
only allow us to calculate probability amplitudes. The probability that an electron 
will go from the source past the slits to the screen is given by summing over all 
possible paths. These probabilities, in turn, may have wave-like behavior, even if 
the electron itself is a point particle. 

The sum over paths reproduces the interference pattern that is experimentally 
seen on the screen. Thus, the path integral approach incorporates the philosophy 
behind the double-slit experiment, which, in turn, embodies the essence of the 
quantum principle. 
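
The difference between adding amplitudes and adding probabilities can be seen in a toy calculation. The sketch below is a simplified illustration, not a full path integral: it keeps only the two straight-line paths through the slits, assigns each a unit-modulus phase $e^{ikL}$ (with $L$ the path length, standing in for $S/\hbar$ for a free particle of fixed energy), and ignores all amplitude fall-off. The geometry, wavelength and function names are assumptions chosen for the example.

```python
import numpy as np


def path_amplitude(source, slit, screen_point, wavelength):
    """Phase factor exp(i k L) for the straight path source -> slit -> screen point."""
    k = 2 * np.pi / wavelength
    length = np.linalg.norm(slit - source) + np.linalg.norm(screen_point - slit)
    return np.exp(1j * k * length)


def screen_probabilities(ys, slit_sep=10.0, dist=1000.0, wavelength=1.0):
    """Return (|K1 + K2|^2, |K1|^2 + |K2|^2) at heights ys on the screen."""
    source = np.array([-dist, 0.0])
    slits = [np.array([0.0, +slit_sep / 2]), np.array([0.0, -slit_sep / 2])]
    quantum, classical = [], []
    for y in ys:
        point = np.array([dist, y])
        amps = [path_amplitude(source, s, point, wavelength) for s in slits]
        # Postulates I and II: add the amplitudes first, then square.
        quantum.append(abs(sum(amps)) ** 2)
        # Classical expectation: add the probabilities of the two alternatives.
        classical.append(sum(abs(a) ** 2 for a in amps))
    return np.array(quantum), np.array(classical)


if __name__ == "__main__":
    ys = np.linspace(-200.0, 200.0, 4001)
    quantum, classical = screen_probabilities(ys)
    print("quantum  min/max:", quantum.min(), quantum.max())
    print("classical min/max:", classical.min(), classical.max())
```

In this toy model the classical sum is flat across the screen, while the squared sum of amplitudes oscillates between nearly zero and twice the classical value, which is the interference pattern described above.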

As in quantum mechanics, we make the transition to classical mechanics by
taking the limit $\hbar \to 0$. For large values of $S$, the exponential of $iS/\hbar$ undergoes
large fluctuations, and hence cancels out to zero. Hence, the paths for which the
variation $\delta S$ is large compared with $\hbar$ do not contribute much to the sum over paths:


$$\delta S \gg \hbar: \quad \sum_{\text{paths}} e^{iS/\hbar} \sim 0 \qquad (8.4)$$


In the classical limit, the paths that dominate the sum are the ones where $\delta S/\hbar$
is as small as possible. However, the path for which $\delta S$ is minimized is just the
classical path:


$$\delta S = 0 \;\to\; \text{classical mechanics} \qquad (8.5)$$


Thus, we recover classical mechanics in the limit as $\hbar \to 0$. The picture that
emerges from the path integral approach is therefore intuitively identical to the
principles of quantum mechanics. To calculate the probability that a particle at
point $a$ goes to a point $b$, one must sum over all possible paths connecting these
two points, including the classical one. The path preferred by classical mechanics
is the one that minimizes the action for $\delta S \ll \hbar$ (Fig. 8.1).
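
The cancellation of rapidly oscillating phases can be illustrated with a one-dimensional toy integral. This is only a sketch under stated assumptions: the "action" is taken to be $S(x) = x^2$, with the single variable $x$ standing in for the deviation of a path from the classical one (so the stationary point is at $x = 0$), and the window limits and value of $\hbar$ are arbitrary choices made for the example.

```python
import numpy as np


def window_amplitude(x_lo, x_hi, hbar, n=400_001):
    """Trapezoid estimate of the integral of exp(i x^2 / hbar) over [x_lo, x_hi]."""
    x = np.linspace(x_lo, x_hi, n)
    f = np.exp(1j * x**2 / hbar)
    dx = x[1] - x[0]
    # Trapezoid rule written out by hand to avoid depending on a NumPy version.
    return dx * (f.sum() - 0.5 * (f[0] + f[-1]))


if __name__ == "__main__":
    hbar = 0.01
    near = abs(window_amplitude(-0.5, 0.5, hbar))  # window containing delta S = 0
    far = abs(window_amplitude(2.0, 3.0, hbar))    # equally wide window far away
    print("near the stationary point:", near)
    print("far from the stationary point:", far)
```

Both windows have the same width, yet the one containing the stationary point dominates, and the imbalance grows as $\hbar$ is made smaller. This is the mechanism by which the sum over paths singles out the classical path.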

Although the path integral method gives us an elegant formalism in which to 
reformulate all of quantum field theory, one should also point out the potential 
drawbacks of the formalism. One problem is that the path integral is not well 
defined in Minkowski space. In this chapter, we will assume that all path integrals 
are computed with the Euclidean metric. Then, the functional integral is taken 
over $e^{-S}$, which has much better convergence properties than integrals over $e^{iS}$.
At the end of the calculation, we assume that we can analytically continue back
to Minkowski space. (The question of whether this analytic continuation from
Minkowski space to Euclidean space and back again is rigorously defined is a
highly nontrivial question. This is a delicate matter, the subject of a field called
axiomatic field theory, which is beyond the scope of this book.)

Figure 8.1. The path integral sums over all possible paths connecting two points, including
the one favored by classical mechanics. In this way, the path integral sums over quantum
corrections to classical mechanics.

Another problem is that the transition between c numbers and operators
becomes ill defined when the Hamiltonian has ordering problems. The path integral
over a system with the Hamiltonian of the form $p^2 f(q)$, for example, becomes
ambiguous when making the transition to the operator language, since p and q 
do not commute. For systems more complex than the harmonic oscillator, the 
integrals may not be Gaussian, and ordering problems may creep into the path 
integral. For complicated systems, one must often use “point splitting” methods, 
that is, separating two fields by a small infinitesimal amount in space-time in order 
to regularize the integrals. Unfortunately, a detailed elaboration of these delicate 
points is also beyond the scope of this book. 

With these problems in mind, let us now compute with the path integral 
approach. We first divide up a path by discretizing space-time. Let us divide up 
each path in three-space into N points (Fig. 8.2). Then the “sum over all paths” 
can be transformed into a functional integral: 


$$ \sum_{\text{paths}} = \lim_{N\to\infty} \prod_{i=1}^{3} \prod_{n=1}^{N} \int dx_n^i = \int Dx \tag{8.6} $$

The integral $\int Dx$ is not an ordinary integral. It is actually an infinite product of
integrals, taken over all possible $dx(t)$. Whenever we use the differential symbol
$D$, we should remember that it is actually an infinite product of differentials taken
over all points.

Figure 8.2. To calculate with path integrals, we break the path into a discrete number of
intermediate points, and then integrate over the position of these intermediate points.

In this functional language, the transition function becomes:

$$ K(b,a) = k \int_a^b Dx\, e^{iS} \tag{8.7} $$

where $k$ can be determined as follows:

$$ K(c,a) = \int K(c,b)\, K(b,a)\, Dx_b \tag{8.8} $$

where we integrate over all possible intermediate points $x_b$ which link points $a$
and $c$.

To give this approach some substance, let us begin with the simplest of all pos- 
sible classical systems, the free nonrelativistic point particle in the first quantized 
formalism. Our discussion begins with the classical action: 


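$$ S = \int dt\, \frac{1}{2} m \dot{x}_i^2 \tag{8.9} $$

Let us now discretize the paths. We take $dt$ to be a small interval $\epsilon$ and then
discretize the Lagrangian:

$$ dt \to \epsilon, \qquad \frac{1}{2} m \dot{x}_i^2\, dt \to \frac{1}{2} m (x_n - x_{n+1})_i^2\, \epsilon^{-1} \tag{8.10} $$

The transition function $K(b,a)$ can be written as the path integral over $e^{iS}$:

$$ K(b,a) = \lim_{N\to\infty} \int \cdots \int dx_1\, dx_2 \cdots dx_{N-1}\; k \exp\left\{ \frac{im}{2\epsilon} \sum_{n=1}^{N} (x_n - x_{n-1})^2 \right\} \tag{8.11} $$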
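The discretized action that appears in the exponent can be evaluated directly for any sampled path. The following is a minimal sketch, not part of the original text: the function names `discretized_action`, `straight_path`, and `wiggled`, and the numerical values of `m`, `eps`, and the endpoints, are all illustrative assumptions. It checks that the straight-line (classical) path gives the smallest discretized action among paths with the same fixed endpoints.

```python
import random

def discretized_action(xs, m, eps):
    """Discretized free-particle action: sum over steps of (m/2) (x_n - x_{n-1})^2 / eps."""
    return sum(0.5 * m * (b - a) ** 2 / eps for a, b in zip(xs, xs[1:]))

def straight_path(xa, xb, n_steps):
    """Uniformly sampled straight line from xa to xb (the classical free path)."""
    return [xa + (xb - xa) * k / n_steps for k in range(n_steps + 1)]

def wiggled(path, size, rng):
    """Perturb interior points only; the endpoints a and b stay fixed."""
    interior = [x + rng.uniform(-size, size) for x in path[1:-1]]
    return [path[0]] + interior + [path[-1]]

# Illustrative values (assumptions): unit mass, 20 time slices of width 0.1.
m, eps, n_steps = 1.0, 0.1, 20
xa, xb = 0.0, 2.0

classical = straight_path(xa, xb, n_steps)
s_classical = discretized_action(classical, m, eps)

rng = random.Random(0)
s_wiggled = [discretized_action(wiggled(classical, 0.3, rng), m, eps) for _ in range(5)]

# For the straight path the sum collapses to m (xb - xa)^2 / (2 T), with T = n_steps * eps.
print(s_classical, m * (xb - xa) ** 2 / (2 * n_steps * eps))
print(s_wiggled)
```

Every perturbed path has a larger action because the cross term between the constant classical step and the perturbation telescopes to zero when the endpoints are held fixed, leaving only a positive quadratic piece.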


Unfortunately, one of the drawbacks of the path integral formalism is that 
embarrassingly few functional integrals can actually be performed. However, we 
will find that the simplest Gaussian path integral is also the one most frequently 
found for free systems. Specifically, we will repeatedly use the Gaussian integral: 


$$ \int_{-\infty}^{\infty} dx\, x^{2n} e^{-r^2 x^2} = \frac{\Gamma\left(n + \frac{1}{2}\right)}{r^{2n+1}} \tag{8.12} $$


To evaluate the expression for K(b, a), we now perform one of these Gaussian 
integrations: 


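$$ \int_{-\infty}^{\infty} dx_2\, \exp\left[ -a(x_1 - x_2)^2 - a(x_2 - x_3)^2 \right] = \sqrt{\frac{\pi}{2a}}\, \exp\left[ -\frac{1}{2} a (x_1 - x_3)^2 \right] \tag{8.13} $$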


The key point is that the Gaussian integral over x2 has left us with another 
Gaussian integral over the remaining variables. This process can be repeated an 
arbitrarily large number of times: Each time we perform a Gaussian integral on 
an intermediate point, we find a Gaussian integral among the remaining variables. 
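The single Gaussian step above is easy to confirm numerically. The sketch below is an illustration added here, not the author's calculation: the helper `integrate` is a plain trapezoid rule, and the values of `a`, `x1`, and `x3` are arbitrary assumptions. It compares the integral over the intermediate point $x_2$ with the closed form $\sqrt{\pi/2a}\,\exp[-\frac{1}{2}a(x_1 - x_3)^2]$.

```python
import math

def integrate(f, lo, hi, n=20001):
    """Composite trapezoid rule on [lo, hi] with n sample points."""
    h = (hi - lo) / (n - 1)
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, n - 1):
        total += f(lo + k * h)
    return total * h

# Arbitrary illustrative values (assumptions); a must be real and positive here.
a, x1, x3 = 0.7, -0.4, 1.1

# Left-hand side: integrate out the intermediate point x2.
lhs = integrate(
    lambda x2: math.exp(-a * (x1 - x2) ** 2 - a * (x2 - x3) ** 2),
    -30.0,
    30.0,
)

# Right-hand side: the Gaussian in the remaining variables x1 and x3.
rhs = math.sqrt(math.pi / (2 * a)) * math.exp(-0.5 * a * (x1 - x3) ** 2)

print(lhs, rhs)
```

The check uses a real, positive $a$; the oscillatory case needed for $e^{iS}$ is then reached by analytic continuation, in line with the Euclidean convention adopted in this chapter.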

After repeated integrations, we find: 


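$$ \int dx_2 \cdots dx_{N-1}\, \exp\left[ -a(x_1 - x_2)^2 - \cdots - a(x_{N-1} - x_N)^2 \right] = \sqrt{\frac{\pi^{N-2}}{(N-1)\, a^{N-2}}}\, \exp\left[ -\frac{a}{N-1} (x_1 - x_N)^2 \right] \tag{8.14} $$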


This, in turn, allows us to calculate the constant k: 


$$ k = \left( \frac{2\pi i \epsilon}{m} \right)^{-N/2} \tag{8.15} $$

If we take the limit as the number of intermediate points goes to infinity, then we
are left with the final result for the transition function:

$$ K(b,a) = \left[ \frac{m}{2\pi i (t_b - t_a)} \right]^{1/2} \exp\left[ \frac{(1/2)\, i m (x_b - x_a)^2}{t_b - t_a} \right] \tag{8.16} $$


This is a pleasant result. This is exactly the Green's function we derived in Eq.
(3.58) for nonrelativistic quantum mechanics. Beginning with only the
postulates of the path integral approach and the simplest possible classical action, we
have derived the Green's function found in quantum mechanics that propagates
Schrödinger waves. It obeys the equation:

$$ i \frac{\partial}{\partial t_b} K(b,a) = -\frac{1}{2m} \frac{\partial^2}{\partial x_b^2} K(b,a) \tag{8.17} $$
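The composition law (8.8) can be tested on this transition function. Because the Minkowski kernel is oscillatory, the sketch below works with its Euclidean continuation $t \to -i\tau$, which turns (8.16) into $\sqrt{m/2\pi\tau}\,\exp[-m(x_b - x_a)^2/2\tau]$; this continuation, the function names `k_euclid` and `compose`, and the sample numbers are assumptions made for illustration only.

```python
import math

def k_euclid(xb, xa, tau, m=1.0):
    """Euclidean free-particle kernel obtained from (8.16) by t -> -i tau."""
    return math.sqrt(m / (2 * math.pi * tau)) * math.exp(-m * (xb - xa) ** 2 / (2 * tau))

def compose(xc, xa, tau_cb, tau_ba, m=1.0, lo=-40.0, hi=40.0, n=40001):
    """Integrate K(c, b) K(b, a) over the intermediate point x_b (trapezoid rule)."""
    h = (hi - lo) / (n - 1)
    total = 0.0
    for k in range(n):
        xb = lo + k * h
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * k_euclid(xc, xb, tau_cb, m) * k_euclid(xb, xa, tau_ba, m)
    return total * h

# Illustrative endpoints and time intervals (assumptions).
xa, xc = -0.2, 1.3
tau_ba, tau_cb = 0.8, 0.5

composed = compose(xc, xa, tau_cb, tau_ba)
direct = k_euclid(xc, xa, tau_ba + tau_cb)
print(composed, direct)
```

Agreement between the two numbers reflects the fact that the Gaussian kernel reproduces itself under integration over intermediate points, which is what fixes the normalization $k$.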


Our first exercise in the path integral formalism gave us encouraging results. 
Now let us tackle more general and more difficult problems, such as (1) the transi- 
tion between the Lagrangian and the Hamiltonian approaches and (2) the transition 
from c-number expressions to operator q-number expressions. In the usual
canonical approach, these two transitions appear rather ad hoc and counterintuitive.

In the path integral approach, the transition between the Lagrangian and
Hamiltonian systems is easily performed by adding an infinite sequence of Gaussian
integrations for the momentum $p_i$. For each infinitesimal integration, we use the
fact that:

$$ \int_{-\infty}^{\infty} dp\, e^{iap^2 + ibp} = \sqrt{\frac{i\pi}{a}}\, e^{-ib^2/4a} \tag{8.18} $$

which can be proved by completing the square. If we let $a = -1/2m$ and $b = \dot{x}$
and integrate over an infinite number of these momenta, then we have:


$$
\begin{aligned}
K(b,a) &= \int_{x_a}^{x_b} Dx\, \exp i \int_{t_a}^{t_b} dt \left[ \frac{1}{2} m \dot{x}_i^2 - V(x) \right] \\
&= \int_{x_a}^{x_b} Dx\, Dp\, \exp i \int_{t_a}^{t_b} dt \left( p\dot{x} - \frac{p^2}{2m} - V(x) \right)
\end{aligned} \tag{8.19}
$$

The Lagrangian appears on the first line, but the Hamiltonian, defined by
$H(p, x) = p^2/2m + V(x)$, appears on the second line. By performing the
functional integral over $Dp$, we can go back and forth between the Lagrangian and
Hamiltonian formalisms. In the path integral formalism, the relationship between
the Lagrangian and the Hamiltonian formalism is no mystery, but simply the
byproduct of performing an additional functional integration over momentum.

(For clarity, because normalization factors, such as $1/2\pi$, appear repeatedly
throughout our discussion, we have absorbed them into the definition of $Dx$ and
$Dp$. We will henceforth drop these trivial normalization factors, since they
can always be explicitly written out later.) Thus, in the path integral formalism,
the difference between the two formalisms only lies in a Gaussian integration
over momentum. The path integral formalism allows us to go between these
formalisms with ease:

$$ L = \frac{1}{2} m \dot{x}_i^2 - V(x) \;\longleftrightarrow\; H = \frac{p^2}{2m} + V(x) \tag{8.20} $$
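The Gaussian momentum integration that links the two formalisms can be checked for a single time slice. The sketch below is an added illustration under stated assumptions: it uses the Euclidean continuation of (8.18), obtained by setting $a = i\epsilon/2m$ and $b = \Delta x$, so that the integrand decays instead of oscillating; the function name `momentum_integral` and the values of `eps`, `m`, and `dx` are hypothetical.

```python
import cmath
import math

def momentum_integral(dx, eps, m=1.0, p_max=60.0, n=60001):
    """Trapezoid estimate of the integral of dp/(2 pi) exp(-eps p^2 / 2m + i p dx)."""
    h = 2 * p_max / (n - 1)
    total = 0j
    for k in range(n):
        p = -p_max + k * h
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * cmath.exp(-eps * p * p / (2 * m) + 1j * p * dx)
    return total * h / (2 * math.pi)

# Illustrative single time slice (assumptions): step eps, displacement dx, unit mass.
eps, m, dx = 0.5, 1.0, 0.3

hamiltonian_side = momentum_integral(dx, eps, m)

# Lagrangian side: the kinetic term (m/2)(dx/eps)^2 * eps in the Euclidean weight,
# together with the normalization produced by the Gaussian integral.
lagrangian_side = math.sqrt(m / (2 * math.pi * eps)) * math.exp(-m * dx * dx / (2 * eps))

print(hamiltonian_side, lagrangian_side)
```

Integrating out $p$ slice by slice in this way replaces $p\dot{x} - p^2/2m$ by the kinetic term $\frac{1}{2}m\dot{x}^2$, which is the content of the correspondence between $H$ and $L$.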

So far, everything has been defined in terms of c-number expressions. Opera- 
tors, which are the basis of the canonical approach, do not enter into the picture at 
all. Now, let us make the second transition, this time from the path integral formal- 
ism to the operator formalism, to show that the operator formalism that we have 
patiently developed in Chapters 3 and 4 is nothing but a specific representation of 
the path integral. 

We recall that in the canonical formalism, the starting point was the canonical
equal-time commutation relation between fields $\phi(x)$ and their conjugates $\pi(x)$.
Only later could we calculate the propagators and finally the S matrix. In the 
path integral formalism, the sequence is roughly the reverse. We begin with the 
S matrix as the starting point, and we later derive the operator formalism as a 
consequence. 

To see how operators naturally emerge in a formalism defined entirely without
operators, let us write the transition function between point $x_1$ at time $t_1$ to point $x_N$
at time $t_N$ in the Heisenberg representation. We will carefully divide the path into
$N$ intermediate points. In this formalism, the transition probability of a particle
at point $x_1$ and time $t_1$ going to $x_N$ and time $t_N$ is given by the matrix element
between eigenstates $|x, t\rangle$.

The Heisenberg representation, we recall, is based on a complete set of position
eigenstates $|x\rangle$ of the position operator $\hat{x}$, which is now treated as an operator with
eigenvalue $x$:

$$ \hat{x}\, |x\rangle = x\, |x\rangle \tag{8.21} $$

We also introduce eigenstates of the momentum operator $\hat{p}$:

$$ \hat{p}\, |p\rangle = p\, |p\rangle, \qquad 1 = \int |p\rangle\, dp\, \langle p| \tag{8.22} $$

such that they are normalized as follows:

$$ \langle x|y\rangle = \delta(x - y), \qquad \langle x|p\rangle = \frac{e^{ipx}}{\sqrt{2\pi}}, \qquad \langle p|x\rangle = \frac{e^{-ipx}}{\sqrt{2\pi}} \tag{8.23} $$


To check the consistency of this normalization, we perform the following manip- 
ulations: 


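$$
\begin{aligned}
\langle x|y\rangle &= \langle x|p\rangle \int dp\, \langle p|y\rangle \\
&= \int dp\, \frac{e^{ipx}}{\sqrt{2\pi}}\, \frac{e^{-ipy}}{\sqrt{2\pi}} \\
&= \int \frac{dp}{2\pi}\, e^{ip(x - y)} \\
&= \delta(x - y)
\end{aligned} \tag{8.24}
$$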

Our normalizations are thus consistent. 

Our task is now to rewrite the functional integration over p at an intermediate 
point along the path in terms of an operator expression defined in the Heisenberg 
picture. We will use the fact that the transition element between two neighboring 
points can be written as: 


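$$ \langle x_2 |\, e^{-iH\,\delta t}\, | x_1 \rangle = \langle x_2, t_2 | x_1, t_1 \rangle \tag{8.25} $$

where $\delta t = t_2 - t_1$. Let us take a specific value of $\dot{x} \approx (x_2 - x_1)/\delta t$ and $dp$ that appears within the
functional integral and carefully rewrite the integral over $dp$ and its integrand as
follows (the derivative $\partial_x$ acts on $x_2$):

$$
\begin{aligned}
\int \frac{dp}{2\pi}\, e^{i(p\dot{x} - H(x,p))\,\delta t}
&= \int \frac{dp}{2\pi}\, e^{-iH(x,p)\,\delta t}\, e^{ip(x_2 - x_1)} \\
&= e^{-iH(x,\partial_x)\,\delta t} \int \frac{e^{ipx_2}}{\sqrt{2\pi}}\, dp\, \frac{e^{-ipx_1}}{\sqrt{2\pi}} \\
&= e^{-iH(x,\partial_x)\,\delta t}\, \langle x_2|p\rangle \int dp\, \langle p|x_1\rangle \\
&= e^{-iH(x,\partial_x)\,\delta t}\, \langle x_2|x_1\rangle \\
&= \langle x_2|\, e^{-iH\,\delta t}\, |x_1\rangle \\
&= \langle x_2, t_2|x_1, t_1\rangle
\end{aligned} \tag{8.26}
$$

We have now made the transition between a Lagrangian defined in terms of $x$ and
$\dot{x}$ and a Hamiltonian defined in terms of $x$ and its derivative $\partial_x$. The transition
was made possible because the derivative of the exponential brings down a $p$:

$$ \partial_x\, e^{ipx} = ip\, e^{ipx}, \qquad e^{-iH(x,\partial_x)\,\delta t}\, e^{ipx} = e^{-iH(x,p)\,\delta t}\, e^{ipx} \tag{8.27} $$

In the path integral formalism, this is the origin of the transition between c numbers
and q-number operators; that is, the insertion of intermediate states defined in $p$
space allows us to replace the $p$ variable with a $\partial_x$ operator. Thus, we have made
the transition between:

$$ H(x,p) \;\leftrightarrow\; H(x,\partial_x), \qquad p \;\leftrightarrow\; -i\frac{\partial}{\partial x} \tag{8.28} $$
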
In summary, we have now shown that the path integral formalism can express 
the propagator K (b, a) in three different ways, in the Lagrangian or Hamiltonian 


formalism, or in the operator formalism in the Heisenberg picture. This can be 
summarized by the following identity: 


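$$
\begin{aligned}
K(b,a) &= \langle x_N, t_N | x_1, t_1 \rangle \\
&= \langle x_N, t_N | x_{N-1}, t_{N-1} \rangle \int dx_{N-1}\, \langle x_{N-1}, t_{N-1} | \cdots | x_2, t_2 \rangle \int dx_2\, \langle x_2, t_2 | x_1, t_1 \rangle \\
&= \int Dx\, \exp\left( i \int_{t_1}^{t_N} dt\, L(x, \dot{x}) \right) \\
&= \int Dx\, Dp\, \exp\left( i \int_{t_1}^{t_N} dt\, \left( p\dot{x} - H(x,p) \right) \right)
\end{aligned} \tag{8.29}
$$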


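Finally, let us reanalyze, from the point of view of path integrals, how the time
ordering operator $T$ enters into the propagator. Let us analyze the matrix element
of an initial state $|x_1, t_1\rangle$ with a final state $\langle x_N, t_N|$, with the operators $x(t_j)$ and
$x(t_k)$ sandwiched between them. We will assume that $t_j > t_k$. As before, we
will take time slices and insert a series of complete intermediate states between
the states at each slice:

$$
\begin{aligned}
\langle x_N, t_N |\, x(t_j)\, x(t_k)\, | x_1, t_1 \rangle = {}& \langle x_N, t_N | x_{N-1}, t_{N-1} \rangle \int dx_{N-1}\, \langle x_{N-1}, t_{N-1} | \cdots \\
& \cdots | x_j, t_j \rangle \int dx_j\, x_j\, \langle x_j, t_j | \cdots | x_k, t_k \rangle \int dx_k\, x_k\, \langle x_k, t_k | \cdots \\
& \cdots | x_2, t_2 \rangle \int dx_2\, \langle x_2, t_2 | x_1, t_1 \rangle
\end{aligned} \tag{8.30}
$$

Taking the limit as the number of time slices goes to infinity, we have:

$$
\begin{aligned}
\langle x_N, t_N |\, x(t_j)\, x(t_k)\, | x_1, t_1 \rangle
&= \int Dx\, Dp\; x(t_j)\, x(t_k)\, \exp\left( i \int_{t_1}^{t_N} dt\, \left( p\dot{x} - H(p,x) \right) \right) \\
&= \int Dx\; x(t_j)\, x(t_k)\, \exp i \left( \int_{t_1}^{t_N} dt\, L(x, \dot{x}) \right)
\end{aligned} \tag{8.31}
$$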


Now, let us reverse the order of the times, such that $t_j < t_k$. In this case,
the previous formula must be modified because we can no longer take time
slices. Thus, whenever $t_j < t_k$, we cannot make the transition from operators
to path integrals unless we reverse the ordering of the operators. In order for this
formalism to make sense, we will always reverse the order of the operators, such
that the later times always appear to the left, so that we can proceed with taking
time slices. To enforce this condition, we must use the time ordered product in
this case:

$$ \langle x_N, t_N |\, T\left[ x(t_j)\, x(t_k) \right] | x_1, t_1 \rangle = \int Dx\, Dp\; x(t_j)\, x(t_k)\, \exp\left( i \int_{t_1}^{t_N} dt\, \left( p\dot{x} - H(p,x) \right) \right) \tag{8.32} $$

For a large number of insertions, we obviously have:

$$ \langle x_N, t_N |\, T\left[ x(t_j)\, x(t_k) \cdots x(t_m) \right] | x_1, t_1 \rangle = \int Dx\, Dp\; x(t_j)\, x(t_k) \cdots x(t_m)\, \exp\left( i \int_{t_1}^{t_N} dt\, \left[ p\dot{x} - H(p,x) \right] \right) \tag{8.33} $$

We emphasize that the left-hand side consists of operators, so the ordering of the
times is important. However, on the right-hand side we have a c-number
expression, where the ordering of the $x(t_i)$ makes no difference. The correspondence
between operators and these c-number expressions in the path integral only holds
when we can make time slices, that is, when the operators are time ordered. From
the path integral point of view, this is the origin of the time ordering in the matrix
elements.


8.2 Derivation of the Schrödinger Equation


In a first quantized formalism, where the action is a function of $x_i$ and not fields,
the path integral formalism gives us an added bonus: It gives us a derivation of the
Schrödinger equation. Usually, introductory courses in quantum mechanics begin
by postulating the Schrödinger wave equation. Certain conventions, such as the
quantization of $x$ and $p$, seem rather arbitrary. Only later does the probabilistic
interpretation emerge. Here, we reverse this order: we begin with the probabilistic
postulates of quantum mechanics and derive the Schrödinger wave equation as a
consequence, thus giving a new physical interpretation to that equation.

In the path integral approach, the evolution of a state is given by the transition
function $K(b, a)$. From a classical point of view, this can be viewed as the
analogue of Huygens' principle, where the evolution of a wave can be determined by
assuming that each point along a wave front emits a new wave front. The
integration over all these infinitesimal wave fronts then gives us the overall evolution of
the wave front. Mathematically, this is given by:

$$ \psi(x_b, t_b) = \int K(x_b, t_b; x_a, t_a)\, \psi(x_a, t_a)\, dx_a \tag{8.34} $$


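Earlier, we derived, assuming only the Lagrangian $(1/2) m \dot{x}_i^2$, an expression
for the nonrelativistic transition function. Now let us calculate how wave fronts
move with this transition function. The time evolution, from $t$ to $t + \epsilon$, is given
by:

$$ \psi(x, t + \epsilon) = \int_{-\infty}^{\infty} A^{-1} \exp\left( \frac{im(x - y)^2}{2\epsilon} \right) \psi(y, t)\, dy \tag{8.35} $$

where:

$$ A = \left( \frac{2\pi i \epsilon}{m} \right)^{1/2} \tag{8.36} $$

To perform this integration, let $dy$ be replaced by $d\eta$, where $\eta = y - x$:

$$ \psi(x, t + \epsilon) = \int_{-\infty}^{\infty} A^{-1} e^{im\eta^2/2\epsilon}\, \psi(x + \eta, t)\, d\eta \tag{8.37} $$

Now Taylor expand the left-hand side in terms of $\epsilon$, and the right-hand side in
terms of $\eta$:

$$ \psi(x, t) + \epsilon \frac{\partial \psi}{\partial t} = \int_{-\infty}^{\infty} A^{-1} e^{im\eta^2/2\epsilon} \left( \psi(x, t) + \eta \frac{\partial \psi}{\partial x} + \frac{1}{2} \eta^2 \frac{\partial^2 \psi}{\partial x^2} + \cdots \right) d\eta \tag{8.38} $$

The integration over $d\eta$ is easily performed. The integration over the term linear
in $\eta$ vanishes because the integrand is odd in $\eta$, and the integrations over the higher terms vanish
in the limit $\epsilon \to 0$. The zeroth-order term integrates to $\psi(x, t)$, and the quadratic
term gives $(i\epsilon/2m)\, \partial^2 \psi/\partial x^2$. This gives us:

$$ i \frac{\partial \psi}{\partial t} = -\frac{1}{2m} \frac{\partial^2 \psi}{\partial x^2} \tag{8.39} $$

This is the Schrödinger wave equation, as desired. It is straightforward to
insert a potential into the path integral, in which case we derive the Schrödinger
wave equation in a potential, which is the traditional starting point for quantum
mechanics.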
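The short-time argument can be tested numerically in its Euclidean form, where the kernel $A^{-1} e^{im\eta^2/2\epsilon}$ becomes a real Gaussian and the Schrödinger equation becomes the diffusion equation $\partial_\tau \psi = (1/2m)\, \partial_x^2 \psi$. The sketch below is an added illustration that relies on that continuation; the function name `euclidean_step`, the test function, and the step size are assumptions.

```python
import math

def euclidean_step(psi, x, eps, m=1.0, n=4001, width=10.0):
    """One Euclidean time step: integrate the Gaussian kernel against psi(x + eta)."""
    sigma = math.sqrt(eps / m)                # width of the short-time kernel
    lo, hi = -width * sigma, width * sigma
    h = (hi - lo) / (n - 1)
    norm = math.sqrt(2 * math.pi * eps / m)   # Euclidean analogue of the constant A
    total = 0.0
    for k in range(n):
        eta = lo + k * h
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * math.exp(-m * eta * eta / (2 * eps)) * psi(x + eta)
    return total * h / norm

def psi(x):
    """Illustrative smooth test function (an assumption)."""
    return math.exp(-x * x)

def d2psi(x):
    """Its exact second derivative."""
    return (4 * x * x - 2) * math.exp(-x * x)

x0, eps, m = 0.5, 1e-3, 1.0

# Finite-difference time derivative produced by one kernel step ...
lhs = (euclidean_step(psi, x0, eps, m) - psi(x0)) / eps
# ... compared with the right-hand side of the Euclidean wave equation.
rhs = d2psi(x0) / (2 * m)
print(lhs, rhs)
```

The two values differ only at order $\epsilon$, which is the numerical counterpart of dropping the higher terms of the Taylor expansion in the limit $\epsilon \to 0$.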


8.3. From First to Second Quantization 


So far, we have only investigated the path integral formalism in the first quantized 
formalism, reproducing known results. The reader may complain that the path 
integral formalism is an elaborate, powerful machinery that has only rederived 
simple results. However, when we make the transition to the second quantized 
formalism and eventually to gauge theory, we will find that the path integral 
approach is the preferred formalism for quantum field theory. We saw earlier that 
the integration over all intermediate points along a path was enforced by inserting 
the number “1” at each intermediate point: 


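$$ 1 = \int |x_i, t_i\rangle\, dx_i\, \langle x_i, t_i| \tag{8.40} $$

The transition to field theory is made by introducing yet another expression
for the number "1," this time based on an integration over an infinite number of
degrees of freedom. We will use the familiar Gaussian integration (with $N$ a
normalization constant):

$$ \delta_{ij} = N \int \prod_{k=1}^{n} dx_k\, dx_k^*\; x_i^*\, x_j\, \exp\left( -\sum_{k=1}^{n} x_k^* x_k \right) \tag{8.41} $$

Now let us replace the variable $x_i$ with a function $\psi(x)$, which is temporarily
viewed as a discretized number $\psi_x$, where $x$ is now seen as an infinite discrete
index.

The transition from finite degrees of freedom to infinite degrees of freedom is
then made by inserting the following expression for "1" into the path integral:

$$ \delta_{x,y} = \int D\psi\, D\psi^*\; \psi_x^*\, \psi_y\, \exp\left( -\sum_{x} \psi_x^* \psi_x \right) \tag{8.42} $$

where:

$$ D\psi = \prod_{x} d\psi_x \tag{8.43} $$

Written in terms of functions $\psi(x)$ rather than discretized variables $\psi_x$, we
now have:

$$ \delta(x - y) = \int D\psi\, D\psi^*\; \psi^*(x)\, \psi(y)\, \exp\left( -\int dz\, \psi^*(z)\, \psi(z) \right) \tag{8.44} $$

These expressions can also be rewritten in terms of bra and ket vectors as follows:

$$ \psi(x) = \langle x|\psi\rangle, \qquad \psi^*(x) = \langle \psi|x\rangle \tag{8.45} $$

This allows us to write (using the symmetry of $\delta(x - y)$ under the exchange of $x$ and $y$):

$$
\begin{aligned}
\delta(x - y) &= \int D\psi\, D\psi^*\; \psi^*(x)\, \psi(y)\, \exp\left( -\int dz\, \psi^*(z)\, \psi(z) \right) \\
&= \langle x|\psi\rangle \int D\psi\, D\psi^*\, \exp\left( -\int \langle \psi|z\rangle\, dz\, \langle z|\psi\rangle \right) \langle \psi|y\rangle \\
&= \langle x|\, 1\, |y\rangle
\end{aligned} \tag{8.46}
$$

Written in this language, the number "1" now becomes:

$$ 1 = |\psi\rangle \int D\psi\, D\psi^*\, \exp\left( -\int dz\, \psi^*(z)\, \psi(z) \right) \langle \psi| \tag{8.47} $$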


We can now repeat all the steps used in making the transition from the Lagrangian
approach to the operator approach in the Heisenberg picture by inserting this new
expression for the number "1" into the path integral. When we do this, we then
have an expression for the transition function written entirely in terms of $\psi(x)$. A
straightforward insertion of this new set of intermediate states yields:

$$ K(b,a) = \int D\psi\, D\psi^*\; \psi^*(x_a, t_a)\, \psi(x_b, t_b)\, \exp\left( i \int_{t_a}^{t_b} dt \int dx\, L \right) \tag{8.48} $$

where the Lagrangian is equal to:

$$ L = \psi^* \left( i \frac{\partial}{\partial t} + \frac{1}{2m} \frac{\partial^2}{\partial x^2} \right) \psi \tag{8.49} $$

At this point, we have now derived a second quantized version of the
nonrelativistic Schrödinger equation. This may seem odd, since usually quantum
field theory is associated with the merger of relativity and quantum mechanics.
But quantum field theory can be viewed independently from relativity; that is,
the essence of quantum field theory is that it has an infinite number of quantum
degrees of freedom. In this sense, the path integral formalism can accommodate a
nonrelativistic Schrödinger field theory.

Next, we would like to compute the familiar expressions found in Chapters 3
and 4 in terms of the path integral approach. First, we define the average $\langle O \rangle$ of
the expression $O$ by inserting it into the integral:

$$ \langle O \rangle = N \int \prod_{i=1}^{n} dx_i\, \exp\left( -\frac{1}{2} \sum_{i,j=1}^{n} x_i D_{ij} x_j \right) O \tag{8.50} $$

Our goal is to find an expression for $\langle O \rangle$, and later make the transition to
an infinite number of degrees of freedom ($n \to \infty$). To find an expression for
this average, we will find it convenient to introduce an intermediate stage in the
calculation. We define the generating functional as follows:

$$ I(D, J) = \int \prod_{i=1}^{n} dx_i\, \exp\left( -\frac{1}{2} \sum_{i,j=1}^{n} x_i D_{ij} x_j + \sum_{i=1}^{n} J_i x_i \right) \tag{8.51} $$

where we fix $N$ by setting $I(D, 0) = N^{-1}$.

To find an expression for $I(D, J)$, we first make a similarity transformation
$x_i' = S_{ij} x_j$, such that $S$ diagonalizes the matrix $D$. We are then left with the
eigenvalues of the matrix $D$ in the integral. The integration separates into a
product of independent integrations over $x_i'$. We then perform each integration
separately, giving us the inverse square root of each eigenvalue of the $D$ matrix. The product
of the eigenvalues, however, is equal to the determinant of the $D$ matrix, giving
us the final result:

$$ I(D, J) = (2\pi)^{n/2} (\det D_{ij})^{-1/2} \exp\left( \frac{1}{2} \sum_{i,k} J_i (D^{-1})_{ik} J_k \right) \tag{8.52} $$

Our goal is to evaluate the average of an arbitrary product $x_1 x_2 \cdots x_n$:

$$ \langle x_1 x_2 \cdots x_n \rangle = N \int \prod_{i=1}^{n} dx_i\; x_1 x_2 \cdots x_n\, \exp\left( -\frac{1}{2} \sum_{i,j} x_i D_{ij} x_j \right) \tag{8.53} $$

where the normalization constant $N$ can be fixed via:

$$ I(D, 0) = N^{-1} = (2\pi)^{n/2} (\det D_{ij})^{-1/2} \tag{8.54} $$

We can also take repeated derivatives of $I(D, J)$ with respect to $J$, and then set $J$
equal to zero. Each time we take the derivative $\partial/\partial J_i$, we bring down a factor of
$x_i$ into the integral:

$$ \langle x_1 x_2 \cdots x_n \rangle = N \prod_{i=1}^{n} \frac{\partial}{\partial J_i} I(D, J) \bigg|_{J=0} = \sum_{\text{pairings}} (D^{-1})_{i_1 i_2} \cdots (D^{-1})_{i_{n-1} i_n} \tag{8.55} $$

This expression, for an odd number of $x$'s, vanishes. However, for two $x$'s, we
have:

$$ \langle x_i x_j \rangle = (D^{-1})_{ij} \tag{8.56} $$

For four $x$'s, we have:

$$ \langle x_i x_j x_k x_l \rangle = (D^{-1})_{ij} (D^{-1})_{kl} + (D^{-1})_{ik} (D^{-1})_{jl} + (D^{-1})_{il} (D^{-1})_{jk} \tag{8.57} $$


Not surprisingly, if we analyze the way in which these indices are paired off, we 
see Wick’s theorem beginning to emerge. The point here is that Wick’s theorem, 
which was based on arguments concerning normal ordering of operators, is now 
emerging from entirely c-number integrations. 
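The pairing rule can be checked against a brute-force Gaussian average. The sketch below is an added illustration rather than part of the original derivation: it enumerates all pairings of the indices, sums the products of $(D^{-1})_{ij}$, and compares with a direct numerical average for a two-variable example. The matrix `d`, the grid parameters, and the function names `pairings`, `wick_moment`, and `numeric_moment` are assumptions.

```python
import math

def pairings(idx):
    """Yield every way of splitting the list idx into unordered pairs."""
    if not idx:
        yield []
        return
    first, rest = idx[0], idx[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def wick_moment(d_inv, idx):
    """Sum over pairings of products of (D^-1)_{ij}; zero for an odd number of x's."""
    if len(idx) % 2:
        return 0.0
    return sum(math.prod(d_inv[i][j] for i, j in p) for p in pairings(list(idx)))

def numeric_moment(d, idx, half_width=8.0, n=321):
    """Brute-force average of x_i x_j ... for a 2x2 matrix D on a uniform grid."""
    h = 2 * half_width / (n - 1)
    grid = [-half_width + k * h for k in range(n)]
    num = den = 0.0
    for x0 in grid:
        for x1 in grid:
            x = (x0, x1)
            quad = d[0][0] * x0 * x0 + 2 * d[0][1] * x0 * x1 + d[1][1] * x1 * x1
            w = math.exp(-0.5 * quad)
            num += w * math.prod(x[i] for i in idx)
            den += w
    return num / den   # the ratio fixes the normalization N automatically

# Illustrative symmetric, positive-definite matrix (an assumption).
d = [[2.0, 0.6], [0.6, 1.5]]
det = d[0][0] * d[1][1] - d[0][1] * d[1][0]
d_inv = [[d[1][1] / det, -d[0][1] / det], [-d[1][0] / det, d[0][0] / det]]

four_point = (0, 0, 1, 1)
wick_val = wick_moment(d_inv, four_point)
numeric_val = numeric_moment(d, four_point)
print(wick_val, numeric_val)
```

For four indices the generator produces exactly three pairings, matching the three terms of the four-point average, and any odd product averages to zero.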

The transition to quantum field theory, as before, is now made by making the
transition from a finite number of variables $x_i$ to an infinite number of variables
$\phi(x)$. As before, we are interested in evaluating the transition probability between
a field at point $x$ and a field at point $y$:

$$ \int D\phi\; \phi(x)\, \phi(y)\, \exp\left( i \int d^4x\, L(\phi) \right) \tag{8.58} $$

To evaluate this integral, we will find it convenient, as before, to introduce the
generating functional:

$$ Z(J) = N \int D\phi\; e^{\, i \int d^4x\, [L(\phi) + J(x)\phi(x)]} \tag{8.59} $$

where:

$$ D\phi = \prod_{x} d\phi(x), \qquad N^{-1} = \int D\phi\; e^{\, i \int d^4x\, L(\phi)} \tag{8.60} $$


To perform this integration for a Klein-Gordon field, we will repeat the steps
we used for the simpler theory based on a finite number of degrees of freedom. We
first make a shift of variables:

$$ \phi(x) \to \phi(x) + \phi_{cl}(x) \tag{8.61} $$

where $\phi_{cl}$ satisfies the Klein-Gordon equation with a source term. We recall that
the Feynman propagator is defined via:

$$ (\partial_\mu \partial^\mu + m^2)_x\, \Delta_F(x - y) = -\delta^4(x - y) \tag{8.62} $$

A classical solution can then be defined via:

$$ \phi_{cl}(x) = -\int \Delta_F(x - y)\, J(y)\, d^4y \tag{8.63} $$

which satisfies:

$$ (\partial_\mu \partial^\mu + m^2)\, \phi_{cl}(x) = J(x) \tag{8.64} $$

We can now perform the integral by performing a Gaussian integration:

$$ Z(J) = \exp\left( -\frac{i}{2} \int d^4x\, d^4y\; J(x)\, \Delta_F(x - y)\, J(y) \right) \tag{8.65} $$

where:

$$ N^{-1} = \left[ \det\left( \partial_\mu \partial^\mu + m^2 \right) \right]^{-1/2} = \int D\phi\, \exp i \left( \int d^4x\, L(\phi) \right) \tag{8.66} $$

Using the fact that:

$$ \frac{\delta}{\delta J(x)} J(y) = \delta^4(x - y) \tag{8.67} $$

we find that the transition function is given by:

$$ i \Delta_F(x - y) = -\frac{\delta^2}{\delta J(x)\, \delta J(y)} Z(J) \bigg|_{J=0} \tag{8.68} $$


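The average of several fields taken at points $x_i$ is now given as follows:

$$ \Delta^{(n)}(x_1, x_2, \ldots, x_n) = i^n \langle 0 |\, T \phi(x_1) \phi(x_2) \cdots \phi(x_n)\, | 0 \rangle = \frac{\delta^n Z(J)}{\delta J(x_1)\, \delta J(x_2) \cdots \delta J(x_n)} \bigg|_{J=0} \tag{8.69} $$

By explicit differentiation, we can take the derivatives for four fields and find:

$$ -\left( \prod_{i=1}^{4} \frac{\delta}{\delta J(x_i)} \right) Z(J) \bigg|_{J=0} = \Delta_F(x_1 - x_2)\, \Delta_F(x_3 - x_4) + \Delta_F(x_1 - x_3)\, \Delta_F(x_2 - x_4) + \Delta_F(x_1 - x_4)\, \Delta_F(x_2 - x_3) \tag{8.70} $$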


In this way, we have derived Wick’s theorem starting with purely c-number 
expressions. 

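$Z(J)$ can also be written as a power expansion in $J$. If we power expand the
generating functional, then we have:

$$ Z(J) = \sum_{n=0}^{\infty} \frac{1}{n!} \int d^4x_1 \cdots d^4x_n\; J(x_1) \cdots J(x_n)\, Z^{(n)}(x_1, \ldots, x_n) \tag{8.71} $$

where:

$$ Z^{(n)}(x_1, \ldots, x_n) = \frac{\delta^n Z(J)}{\delta J(x_1) \cdots \delta J(x_n)} \bigg|_{J=0} = i^n \langle 0 |\, T \phi(x_1) \cdots \phi(x_n)\, | 0 \rangle \tag{8.72} $$

In this way, the path integral method can derive all the expressions found earlier
in the canonical formalism.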




8.4 Generator of Connected Graphs 


In analyzing complicated Feynman diagrams, we must distinguish between two 
types of graphs: connected and disconnected graphs. A graph is called discon- 
nected when it can be separated into two or more distinct pieces without cutting 
any line. 

The generating functional $Z(J)$ that we have been analyzing generates all
types of Feynman graphs, both connected and disconnected. However, when we
apply the formalism of path integrals to a variety of physical problems, including
renormalization theory, it is often desirable to introduce a new functional that
generates just the connected graphs. The path integral formalism is versatile
enough to give us this new generating functional, which is denoted $W(J)$. We
define this generator as follows:

$$ Z(J) = e^{iW(J)}, \qquad W(J) = -i \log Z(J) \tag{8.73} $$


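If we take repeated derivatives of $W(J)$ to calculate the relationship between $Z$
and $W$, we find:

$$ \frac{\delta^2 W}{\delta J(x_1)\, \delta J(x_2)} = \frac{i}{Z^2} \frac{\delta Z}{\delta J(x_1)} \frac{\delta Z}{\delta J(x_2)} - \frac{i}{Z} \frac{\delta^2 Z}{\delta J(x_1)\, \delta J(x_2)} \tag{8.74} $$

and (dropping terms with odd derivatives of $Z$, which vanish at $J = 0$):

$$ \frac{\delta^4 W}{\delta J(x_1)\, \delta J(x_2)\, \delta J(x_3)\, \delta J(x_4)} = i \left( \frac{1}{Z^2} \frac{\delta^2 Z}{\delta J(x_1)\, \delta J(x_2)} \frac{\delta^2 Z}{\delta J(x_3)\, \delta J(x_4)} + \text{perm.} \right) - \frac{i}{Z} \frac{\delta^4 Z}{\delta J(x_1)\, \delta J(x_2)\, \delta J(x_3)\, \delta J(x_4)} \tag{8.75} $$

To analyze the content of these equations, let us power expand $W(J)$ in powers
of $J$:

$$ W(J) = \sum_{n=1}^{\infty} \frac{1}{n!} \int d^4x_1 \cdots d^4x_n\; J(x_1) \cdots J(x_n)\, W^{(n)}(x_1, \ldots, x_n) \tag{8.76} $$

Taking $J = 0$, we arrive at:

$$ i W^{(2)}(x_1, x_2) = Z^{(2)}(x_1, x_2) \tag{8.77} $$

This is not surprising, since the propagator is connected. The expansion,
however, becomes nontrivial when we consider expanding out to fourth order, where
disconnected graphs enter into the functional:

$$ W^{(4)}(x_1, x_2, x_3, x_4) = i \left[ Z^{(2)}(x_1, x_2)\, Z^{(2)}(x_3, x_4) + \text{perm.} \right] - i Z^{(4)}(x_1, x_2, x_3, x_4) \tag{8.78} $$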


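This equation can be checked to show that $W$ generates only connected graphs.
For example, in $\phi^4$ theory to order $\lambda$, we can show that this works as indicated. In
this case:

$$ Z^{(2)}(x_1, x_2) = -i \Delta_F(x_1 - x_2) + \frac{\lambda}{2} \int d^4z\; \Delta_F(x_1 - z)\, \Delta_F(z - x_2)\, \Delta_F(z, z) \tag{8.79} $$

while:

$$
\begin{aligned}
Z^{(4)}(x_1, x_2, x_3, x_4) = {}& -\left[ \Delta_F(x_1 - x_2)\, \Delta_F(x_3 - x_4) + 2 \text{ terms} \right] \\
& - \frac{i\lambda}{2} \left( \int d^4z\; \Delta_F(x_1 - z)\, \Delta_F(z, z)\, \Delta_F(z - x_2)\, \Delta_F(x_3 - x_4) + 5 \text{ terms} \right) \\
& - \frac{i\lambda}{4!} \left( \int d^4z\; \Delta_F(x_1 - z)\, \Delta_F(x_2 - z)\, \Delta_F(x_3 - z)\, \Delta_F(x_4 - z) + 23 \text{ terms} \right)
\end{aligned} \tag{8.80}
$$

Inserting these factors back into the identity for $W^{(4)}(x_1, x_2, x_3, x_4)$, we find that
the disconnected pieces cancel, and the only term which survives is the connected
piece, which forms the topology of a cross.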
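The cancellation of disconnected pieces can be seen in a zero-dimensional toy model, where the "path integral" is an ordinary integral over a single variable $x$. The sketch below is an added illustration under stated assumptions: it uses a Euclidean weight $\exp(-x^2/2 - \lambda x^4/4!)$ rather than the Minkowski weight of the text, so that the connected four-point function is the fourth cumulant $\langle x^4 \rangle - 3 \langle x^2 \rangle^2$; the function name `moments` and the value of `lam` are hypothetical.

```python
import math

def moments(lam, half_width=12.0, n=4801):
    """Return (<x^2>, <x^4>) for the weight exp(-x^2/2 - lam x^4 / 4!)."""
    h = 2 * half_width / (n - 1)
    z = m2 = m4 = 0.0
    for k in range(n):
        x = -half_width + k * h
        w = math.exp(-0.5 * x * x - lam * x ** 4 / 24.0)
        z += w
        m2 += w * x * x
        m4 += w * x ** 4
    return m2 / z, m4 / z

lam = 0.01   # small illustrative coupling (an assumption)

# Free theory: the full four-point function is purely disconnected.
free_m2, free_m4 = moments(0.0)
free_connected = free_m4 - 3 * free_m2 ** 2

# Interacting theory: subtracting the three disconnected pairings leaves
# the connected piece, which starts at first order in the coupling.
m2, m4 = moments(lam)
disconnected = 3 * m2 ** 2
connected = m4 - disconnected
print(free_connected, connected, -lam)
```

In this toy model the connected four-point function is $-\lambda$ to lowest order, the analogue of the single cross-shaped vertex, while the much larger disconnected part $3\langle x^2 \rangle^2$ drops out of the logarithm of the generating function.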

Next, we would like to find the generating functional for proper vertices $\Gamma$,
which is essential in a discussion of renormalization theory. Proper vertices (or
one-particle irreducible vertices), we recall, appear when we consider
renormalizing coupling constants. We define $\Gamma(\phi)$, the generator of proper vertices, via a
Legendre transformation as:

$$ \Gamma(\phi) = W(J) - \int d^4x\; J(x)\, \phi(x) \tag{8.81} $$

(From now on, we will use the symbol $\phi$ to represent the c-number field.)

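The fields $\phi$ and $J$ have a nontrivial relationship between them. They are not
independent of each other. In fact, by taking repeated derivatives, we can establish
the relationship between them. Let us take the partial derivative of the previous
equations with respect to $J$ (keeping $\phi$ fixed) and with respect to $\phi$, keeping $J$
fixed. Then by differentiating both sides, we have:

$$ \frac{\delta W(J)}{\delta J(x)} = \phi(x), \qquad \frac{\delta \Gamma(\phi)}{\delta \phi(x)} = -J(x) \tag{8.82} $$

Let us take repeated differentials of the above equations. Differentiating by $J$ and
by $\phi$, we find:

$$
\begin{aligned}
G(x, y) &= -\frac{\delta^2 W}{\delta J(x)\, \delta J(y)} = -\frac{\delta \phi(x)}{\delta J(y)} \\
\Gamma(x, y) &= \frac{\delta^2 \Gamma}{\delta \phi(x)\, \delta \phi(y)} = -\frac{\delta J(x)}{\delta \phi(y)}
\end{aligned} \tag{8.83}
$$

If we treat $\Gamma(x, y)$ and $G(x, y)$ as matrices with continuous space-time indices,
then they are inverses of each other, as can be seen as follows:

$$
\begin{aligned}
\int d^4y\; G(x, y)\, \Gamma(y, z) &= -\int d^4y\; \frac{\delta^2 W}{\delta J(x)\, \delta J(y)} \frac{\delta^2 \Gamma}{\delta \phi(y)\, \delta \phi(z)} \\
&= \int d^4y\; \frac{\delta \phi(x)}{\delta J(y)} \frac{\delta J(y)}{\delta \phi(z)} \\
&= \frac{\delta \phi(x)}{\delta \phi(z)} \\
&= \delta^4(x - z)
\end{aligned} \tag{8.84}
$$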


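We would now like to establish a relationship between third derivatives of
the functionals. If we differentiate the previous equation by $J(u)$, we find that it
vanishes. Thus, we find the relationship:

$$ \int d^4y\; \frac{\delta^3 W}{\delta J(x)\, \delta J(u)\, \delta J(y)} \frac{\delta^2 \Gamma}{\delta \phi(y)\, \delta \phi(z)} + \int d^4y\, d^4y'\; \frac{\delta^2 W}{\delta J(x)\, \delta J(y)} \frac{\delta^2 W}{\delta J(u)\, \delta J(y')} \frac{\delta^3 \Gamma}{\delta \phi(y')\, \delta \phi(y)\, \delta \phi(z)} = 0 \tag{8.85} $$

where we have used the fact that:

$$ \frac{\delta}{\delta J(u)} = \int d^4y'\; \frac{\delta \phi(y')}{\delta J(u)} \frac{\delta}{\delta \phi(y')} = \int d^4y'\; \frac{\delta^2 W}{\delta J(u)\, \delta J(y')} \frac{\delta}{\delta \phi(y')} \tag{8.86} $$

Figure 8.3. Graphic representation of the relationship between $W(J)$ and $\Gamma(\phi)$ out to third
order.

We can simplify this equation a bit by inverting the matrix $\Gamma(y, z)$ that appears.
Then we find:

$$ \frac{\delta^3 W}{\delta J(x)\, \delta J(y)\, \delta J(z)} = -\int d^4x'\, d^4y'\, d^4z'\; G(x, x')\, G(y, y')\, G(z, z')\, \frac{\delta^3 \Gamma}{\delta \phi(x')\, \delta \phi(y')\, \delta \phi(z')} \tag{8.87} $$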

Graphically, this is represented in Figure 8.3. In this way, we can derive
relationships between the generating functionals. However, taking repeated derivatives
becomes quite involved when we increase the number of legs. There is yet another,
perhaps more direct way in which to see the relationship between the various
generating functions by taking power expansions. As before, we can power expand
as follows:

$$ \Gamma(\phi) = \sum_{n=0}^{\infty} \frac{1}{n!} \int d^4x_1 \cdots d^4x_n\; \phi(x_1) \cdots \phi(x_n)\, \Gamma^{(n)}(x_1, \ldots, x_n) \tag{8.88} $$

We want a way to compare $\Gamma^{(n)}$ and $W^{(n)}$. To solve this problem, we would
like to power expand $\phi$ in terms of $J$. By Taylor's theorem, we know that a power
expansion of $\phi$ can be expressed in terms of powers of $J$, with coefficients given
by the $n$th derivative of $\phi$ with respect to $J$. However, we already know from Eq.
(8.83) that the derivative of $\phi$ with respect to $J$ is given by the second derivative
of $W$ with respect to $J$. Thus, from Eq. (8.82), we find:

$$ \phi(x) = \int d^4x_1\; W^{(2)}(x, x_1)\, J(x_1) + \frac{1}{2} \int d^4x_1\, d^4x_2\; W^{(3)}(x, x_1, x_2)\, J(x_1)\, J(x_2) + \cdots \tag{8.89} $$

We also know the converse, that we can power expand $J$ in terms of $\phi$.
Likewise, we know from Taylor's theorem that the coefficient of each term is
given by the $n$th derivative of $J$ with respect to $\phi$. Thus, Eqs. (8.82) and (8.87)
give us:

$$
\begin{aligned}
J(x) = {}& \int d^4x_1\; [W^{(2)}]^{-1}(x, x_1)\, \phi(x_1) \\
& - \frac{1}{2} \int d^4y_1\, d^4y_2\, d^4y_3\, d^4x_1\, d^4x_2\; [W^{(2)}]^{-1}(x, y_3)\, [W^{(2)}]^{-1}(x_1, y_1) \\
& \qquad \times [W^{(2)}]^{-1}(x_2, y_2)\, W^{(3)}(y_1, y_2, y_3)\, \phi(x_1)\, \phi(x_2) + \cdots
\end{aligned} \tag{8.90}
$$

Let us introduce what is called the "amputated" functional:

$$ \bar{W}^{(n)}(x_1, \ldots, x_n) = \prod_{i=1}^{n} \int d^4y_i\; [W^{(2)}]^{-1}(x_i, y_i)\; W^{(n)}(y_1, \ldots, y_n) \tag{8.91} $$

This has a simple meaning. Since the connected piece $W^{(n)}$ always has propagators
connected to each external leg, this means that $\bar{W}^{(n)}$ is just the connected part minus
these external legs. That is the reason why it is often called the "amputated"
Green's function for connected graphs.

Now that we have defined $\bar{W}$, we can solve for the relationship
between $\bar{W}$ and $\Gamma$. Equating terms with the same power of $\phi$, we find:

$$
\begin{aligned}
W^{(2)}(x_1, x_2) &= -\left[ \Gamma^{(2)}(x_1, x_2) \right]^{-1} \\
\bar{W}^{(3)}(x_1, x_2, x_3) &= \Gamma^{(3)}(x_1, x_2, x_3) \\
\bar{W}^{(4)}(x_1, x_2, x_3, x_4) &= \Gamma^{(4)}(x_1, x_2, x_3, x_4) \\
& \quad + \int d^4y\, d^4z\; \Gamma^{(3)}(x_1, x_2, y)\, W^{(2)}(y, z)\, \Gamma^{(3)}(z, x_3, x_4) + 2 \text{ terms}
\end{aligned} \tag{8.92}
$$


Thus, to any desired level of expansion, we can find the relationship between 
the proper vertices and the amputated Green’s function for connected graphs. 
The advantage of this path integral approach is that our results are independent 
of perturbation theory. Without having to use complicated graphical techniques, 
we can rapidly prove nontrivial relations between different types of vertices and 
propagators. This will prove useful in renormalization theory. 


8.5 Loop Expansion 


Up to now, we have only explored the power expansion of the S matrix in powers
of the coupling constant. However, in quantum field theory it is often convenient
to expand in a different power series, one based on the number of loops in a
Feynman diagram. In this section, we will show that the loop expansion, in turn,
corresponds to a power expansion in $\hbar$, Planck's constant. (To show this, we must
reinsert all $\hbar$ factors that were eliminated when we originally set $\hbar = 1$.)

The expansion in loop number or $\hbar$ has important implications. For example,
in Chapter 6 we learned that the Feynman tree diagrams for various scattering
amplitudes simply reproduced the results of the classical theory. A complicated
tree diagram may appear with a large number of coupling constants, but it still
only corresponds to the classical theory. To see the true effects of quantization,
we have to go beyond the tree diagrams and study loops. A power expansion in
the loop number or $\hbar$, rather than the coupling constant, therefore measures the
deviation of the quantum theory from the classical theory.

Similarly, loop effects are also important when discussing radiative
corrections to a quantum field theory. If a Klein-Gordon theory, for example, has an
interacting potential $V(\phi)$, then radiative corrections will modify this potential.
These radiative corrections, in turn, are calculated in the loop expansion. These
loop corrections are important because they shift the minimum of $V(\phi)$ and hence
change the vacua of the theory. This expansion will prove useful when calculating
radiative corrections to the effective potential in Chapter 10.

In this section, we will show that the path integral gives a very convenient 
way in which to power expand a theory in the loop order. Let us now rewrite 
our previous expressions for the generating functional, explicitly putting back all 
factors of ħ that we previously omitted. We know from dimensional arguments
and the definition of the path integral that the action appears in the functional as
S/ħ, so the generating functional Z(J) can be written as:


Z(J) = \int D\phi\, \exp\left( \frac{i}{\hbar} \int \left[ \mathcal{L} + \hbar J(x)\phi(x) \right] d^4x \right)    (8.93)


Since we are interested in the relation between ħ and the standard perturbation
theory, let us divide the Lagrangian into the free and interacting parts,
ℒ = ℒ_0 + ℒ_I. We can extract ℒ_I from the path integral by converting it into an
operator:


Z(J) = \exp\left[ \frac{i}{\hbar} \int d^4x\, \mathcal{L}_I\left( -i\frac{\delta}{\delta J} \right) \right] Z_0(J)    (8.94)
where Z_0(J) is just a function of the free Lagrangian ℒ_0. If we evaluate the
derivatives, then the exponential in front of Z_0 simply reproduces ℒ_I(φ) in the
exponential, giving us back the original expression for Z(J).

The advantage of writing the functional in this fashion is that we can now
explicitly perform the functional integral over φ in the free partition function
Z_0(J), leaving us with the standard expression:


Z_0(J) = \exp\left( -\frac{i}{2}\hbar \int d^4x\, d^4y\, J(x)\Delta_F(x - y)J(y) \right)    (8.95)


Now we can begin the counting of ħ in a typical Feynman graph by inserting
Z_0 back into the expression for Z(J). For any Feynman diagram, each propagator,
from the previous expression for Z_0, is multiplied by ħ. However, each vertex,
because it appears in the combination ℒ_I/ħ, is multiplied by a factor of ħ^{-1}.

For an arbitrary Feynman graph, the total counting of ħ is given by ħ raised
to the power of P - V, that is, the number of propagators minus the number of
vertices. However, we know from our discussion of renormalization theory in the
previous chapter that:


L = P - V + 1    (8.96)


where L is the number of loops in a Feynman diagram. For any Feynman diagram,
we therefore pick up an overall factor of:


\hbar^{P - V} = \hbar^{L - 1}    (8.97)


It is now easy to see that a power expansion in ħ is also a power expansion in
the loop number. In Chapter 10, we will use this formalism of loop expansions to
calculate the radiative corrections to several quantum field theories, showing that
the loop expansion is powerful enough to shift the minimum of the potential V(φ)
via radiative corrections. This will prove essential in isolating the true vacuum of 
a theory with a broken symmetry. 
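
The counting in Eqs. (8.96) and (8.97) can be checked on a few familiar graphs. The sketch below is only an illustration: the function names and the list of φ⁴ diagrams are chosen here rather than taken from the text, and P is assumed to count internal propagators only, with external legs amputated.

```python
def loop_number(num_propagators, num_vertices):
    """Number of loops of a connected graph, L = P - V + 1 (Eq. 8.96)."""
    return num_propagators - num_vertices + 1


def hbar_power(num_propagators, num_vertices):
    """Power of hbar: +1 for each propagator, -1 for each vertex."""
    return num_propagators - num_vertices


# (description, internal propagators P, vertices V) for some phi^4 graphs
graphs = [
    ("tree-level 2 -> 2 scattering", 0, 1),
    ("one-loop self-energy (tadpole)", 1, 1),
    ("one-loop 2 -> 2 scattering (bubble)", 2, 2),
    ("two-loop self-energy (sunset)", 3, 2),
]

for name, P, V in graphs:
    L = loop_number(P, V)
    # Eq. (8.97): hbar^(P - V) = hbar^(L - 1)
    assert hbar_power(P, V) == L - 1
    print(f"{name}: L = {L}, overall factor hbar^{hbar_power(P, V)}")
```

Every tree graph carries the same factor ħ^{-1}, and each additional loop costs one more power of ħ, which is the sense in which the loop expansion is an expansion in ħ.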


8.6 Integration over Grassmann Variables 


Matter, of course, is not just bosonic. To incorporate fermions in the path integral
formalism, we must define how to integrate over anticommuting variables. To
do this, we must use Grassmann variables, which are a set of anticommuting
numbers satisfying θ_i θ_j = -θ_j θ_i. Integration over Grassmann variables is
problematic, since θ² = 0 for an anticommuting number, and hence the entire
foundation of calculus seems to collapse. However, a clever choice allows us to
generalize the path integral formalism to fermions.

One of the features that we would like to incorporate in an integration over
Grassmann variables is the fact that the integral over all space is translationally
invariant:


\int_{-\infty}^{\infty} dx\, \phi(x) = \int_{-\infty}^{\infty} dx\, \phi(x + c)    (8.98)


Let us try to incorporate this feature in an integration over a Grassmann variable: 


\int d\theta\, \phi(\theta) = \int d\theta\, \phi(\theta + c)    (8.99)


An arbitrary function φ(θ) can be easily decomposed in a Taylor expansion, which
terminates after the term linear in θ, since higher terms are zero:


\phi(\theta) = a + b\theta    (8.100)
Let us now define:


I_0 = \int d\theta; \qquad I_1 = \int d\theta\, \theta    (8.101)


Inserting this power expansion into the integral, we find:

\int d\theta\, \phi(\theta) = a I_0 + b I_1 = (a + bc) I_0 + b I_1    (8.102)


In order to maintain this identity for arbitrary c, we choose the following
normalizations:


I_0 = 0; \qquad I_1 = 1    (8.103)


This, in turn, forces us to make the following unorthodox definitions:


\int d\theta = 0; \qquad \int d\theta\, \theta = 1    (8.104)

This is a novel choice of definitions, for it means that the integral over a Grassmann
variable equals the derivative:


\int d\theta = \frac{\partial}{\partial\theta}    (8.105)


Now, we generalize this discussion to a nontrivial case of many Grassmann
variables. With N such variables, we wish to perform the integration:


I(A) = \int \prod_{i=1}^{N} d\theta_i\, d\bar\theta_i\, \exp\left( \sum_{i,j=1}^{N} \bar\theta_i A_{ij} \theta_j \right)    (8.106)


where θ_i and θ̄_i are two distinct sets of Grassmann variables. To evaluate this
integral, we simply power expand the exponential. Because the Grassmann integral
of a constant is zero, the only term that survives the integration is the Nth
term in the expansion of the exponential:


I(A) = \int \prod_{i=1}^{N} d\theta_i\, d\bar\theta_i\, \frac{1}{N!} \left( \sum_{i,j=1}^{N} \bar\theta_i A_{ij} \theta_j \right)^N    (8.107)


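Most of the terms in the integral of the Nth term are zero. The only terms
that survive are given by:


I(A) = \left( \int \prod_{i=1}^{N} d\theta_i\, d\bar\theta_i\, \bar\theta_i \theta_i \right) \sum_{\rm perm} \epsilon^{i_1 i_2 \cdots i_N} A_{1 i_1} A_{2 i_2} \cdots A_{N i_N} = \det A    (8.108)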


An essential point is that the determinant appears in the numerator, rather than the 
denominator. This will have some significant implications later when we discuss 
ghosts and the Faddeev—Popov quantization program. 
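
The result I(A) = det A can be verified by brute force for small N. The following is a minimal sketch, not a standard library: it represents a Grassmann-valued function as a dictionary from sorted tuples of generator indices to coefficients, with the labelling θ_i -> 2i and θ̄_i -> 2i + 1 chosen here purely for illustration. Integration uses the rules ∫dθ = 0 and ∫dθ θ = 1, with the variable moved next to its differential first.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def sort_sign(indices):
    """Sort anticommuting generators; return (sign, sorted tuple), sign 0 if one repeats."""
    idx = list(indices)
    if len(set(idx)) != len(idx):
        return 0, ()
    sign = 1
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign  # each adjacent swap costs a minus sign
    return sign, tuple(idx)


def add(f, g):
    out = dict(f)
    for key, c in g.items():
        out[key] = out.get(key, 0) + c
    return {k: c for k, c in out.items() if c != 0}


def scale(f, factor):
    return {k: c * factor for k, c in f.items()}


def mul(f, g):
    out = {}
    for kf, cf in f.items():
        for kg, cg in g.items():
            sign, key = sort_sign(kf + kg)
            if sign:
                out[key] = out.get(key, 0) + sign * cf * cg
    return {k: c for k, c in out.items() if c != 0}


def integrate(f, k):
    """Berezin integral over generator k: move theta_k to the front, then drop it."""
    out = {}
    for key, c in f.items():
        if k in key:
            p = key.index(k)
            rest = key[:p] + key[p + 1:]
            out[rest] = out.get(rest, 0) + (-1) ** p * c
    return out


def grassmann_gaussian(A):
    """Evaluate Eq. (8.106) exactly for a small square matrix A."""
    n = len(A)
    X = {}  # the bilinear sum_ij thetabar_i A_ij theta_j
    for i in range(n):
        for j in range(n):
            X = add(X, mul({(2 * i + 1,): Fraction(A[i][j])}, {(2 * j,): Fraction(1)}))
    result = {(): Fraction(1)}
    power = {(): Fraction(1)}
    for k in range(1, n + 1):  # the series terminates: X^(n+1) = 0
        power = mul(power, X)
        result = add(result, scale(power, Fraction(1, factorial(k))))
    for i in range(n):  # each pair d(theta_i) d(thetabar_i) is even, so pair order is immaterial
        result = integrate(result, 2 * i + 1)  # innermost: thetabar_i
        result = integrate(result, 2 * i)      # then theta_i
    return result.get((), Fraction(0))


def det_by_permutations(A):
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        sign, _ = sort_sign(perm)
        prod = 1
        for i in range(n):
            prod *= A[i][perm[i]]
        total += sign * prod
    return total


A2 = [[2, 1], [3, 5]]
A3 = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
print(grassmann_gaussian(A2), det_by_permutations(A2))
print(grassmann_gaussian(A3), det_by_permutations(A3))
```

For commuting variables the analogous Gaussian integral would instead be proportional to (det A)^{-1}, which is the contrast drawn above.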

Now we make the transition from θ_i to the fermionic field ψ(x). We introduce
two sources η and η̄ and define the generating functional:


Z(\eta, \bar\eta) = N \int D\psi\, D\bar\psi\, \exp\left( i \int d^4x \left[ \mathcal{L} + \bar\eta\psi + \bar\psi\eta \right] \right)    (8.109)
where:
\mathcal{L} = \bar\psi (i\gamma^\mu \partial_\mu - m) \psi    (8.110)
As before, we can perform the functional integral by shifting variables:


\psi(x) \to \psi(x) + \psi_{cl}(x)    (8.111)
where:
\psi_{cl}(x) = -\int S_F(x - y)\, \eta(y)\, d^4y    (8.112)
Performing the integral, we find:
Z(\eta, \bar\eta) = \exp\left( -i \int d^4x\, d^4y\, \bar\eta(x) S_F(x - y) \eta(y) \right)    (8.113)


where:
N = \det (i\slashed{\partial} - m)    (8.114)


The averages over the field variables can now be found by successively
differentiating with respect to the η and η̄ fields and then setting them equal to
zero:


5 5 
d7n(x) dn(y) 


4 i Dv D¥ F)wwe lo — (8.115) 


— Spe ay) ZOD) i ap 


Successive differentiation with respect to the source fields gives us:


n 8 n 3 
re , _ ; 
II 6n(xi) I] Son" Dl eno 


= (i) (O/T Px) ++: POn)WOr) ++: VOn)|0) (8.116) 


By performing the functional differentiation with respect to η and η̄, we once again
retrieve Wick's expansion for fermionic fields.


8.7 Schwinger—Dyson Equations 


The functional technique allows us to formulate QED based on the Schwinger— 
Dyson integral equation. This equation, when power expanded, yields the standard 
perturbation theory. But since we do not necessarily have to power expand these 
equations, these integral equations also apply to bound-state and nonperturbative 
problems. The Schwinger-Dyson equation is based on the deceptively simple
observation that the integral of a derivative is zero:


\int D\phi\, \frac{\delta}{\delta\phi} = 0    (8.117)


Although this statement appears trivial from the point of view of functional analy- 
sis, it yields highly nontrivial relations among generating functionals in quantum 
field theory. 

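In particular, let us act on the generating functional Z(J) for a scalar field
theory:


0 = \int D\phi\, \frac{\delta}{\delta\phi(x)} \exp\left( iS(\phi) + i \int d^4x\, J\phi \right)    (8.118)
The functional derivative of the source term simply pulls down a factor of J.
Writing S'(φ) for δS/δφ, we get:


0 = \int D\phi\, \left[ iS'(\phi) + iJ \right] \exp\left( iS + i \int d^4x\, J(x)\phi(x) \right)    (8.119)
This can be rewritten as:

\left[ S'\left( -i\frac{\delta}{\delta J} \right) + J \right] Z(J) = 0    (8.120)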


This is the Schwinger—Dyson relation, which is independent of perturbation theory. 
At this point, we can take any number of derivatives of this equation with respect 
to the fields and obtain a large number of integral equations involving various 
Green’s functions. Or, we can power expand this equation and reproduce the 
known perturbation theory. 
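
The content of Eq. (8.120) can be seen in a toy setting. The sketch below is an illustration under assumptions not made in the text: it replaces the functional integral by an ordinary one-dimensional integral with a Euclidean weight exp(-S(x) + Jx), and picks S(x) = x²/2 + λx⁴/24. The statement that the integral of a total derivative vanishes then becomes ⟨S'(x)⟩ = J, which holds exactly and not just order by order in λ.

```python
import math

lam = 1.0  # quartic coupling (illustrative value)
J = 0.3    # constant source (illustrative value)


def action_prime(x):
    """S'(x) for the toy action S(x) = x^2/2 + lam * x^4/24."""
    return x + lam * x ** 3 / 6.0


def weight(x):
    """Euclidean weight exp(-S(x) + J x)."""
    return math.exp(-(0.5 * x * x + lam * x ** 4 / 24.0) + J * x)


def expectation(observable, lo=-12.0, hi=12.0, n=24001):
    """Weighted average of observable(x) by the trapezoid rule on [lo, hi]."""
    h = (hi - lo) / (n - 1)
    num = 0.0
    den = 0.0
    for k in range(n):
        x = lo + k * h
        w = weight(x) * (0.5 if k in (0, n - 1) else 1.0)
        num += w * observable(x)
        den += w
    return num / den


lhs = expectation(action_prime)
print(lhs, J)  # the two numbers should agree to high accuracy
```

Expanding ⟨x + λx³/6⟩ in powers of λ would reproduce the perturbative series for this toy model, mirroring the remark that the relation can either be used as it stands or power expanded.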

For QED, the generalization of Eq. (8.120) reads: 


r) rt) 6.5 
Si 0 8.121 
lsac8 (tg it ix "| (J,n, 7) ( ) 

Our strategy will be to convert this expression for Z(J) into an expression for
W(J) and then into an expression for Γ(φ). Then we will take a derivative with respect
to A_μ and set all sources to zero. We begin by using the fact that:


\frac{\delta S}{\delta A^\mu} = \left[ \partial^2 g_{\mu\nu} - (1 - \alpha^{-1}) \partial_\mu \partial_\nu \right] A^\nu - e\bar\psi\gamma_\mu\psi    (8.122)


We can rewrite the previous equation as an equation on W(J): 


“ éW 
Ju + [8 gu, — Ud —@7")0,9, | oa 
v 
dbW dW 6 éW
7 a poe a 0) 8.123 
a 7) ue eY7] “on @ én ) ( ) 


Now let us convert this expression for W(J) into an expression for Γ(φ), defined
by the Legendre transformation:


\Gamma(A, \psi, \bar\psi) = W(J, \eta, \bar\eta) - \int d^4x \left( J_\mu A^\mu + \bar\eta\psi + \bar\psi\eta \right)    (8.124)


We must make the substitution: 


Au = syn! 5h 
ar ar 

= SS 8.125 

Ju yA n by ( ) 


Then the Schwinger—Dyson equation can be written as: 


é6r 


TAM) es, be -(1- a)3,8, |A*(x) 


2 =! 
-ir ln () Go] e129 


where the last term on the right is proportional to the electron propagator, and we 
have used the fact that: 


SW sr 


Pr Ga EN A al es 8.127 
; bNa(x)d7y(z) dp, (z)dWe(y) ln=i=y=9=0 ‘ ) 


— Sap5*(x — y) = if d* 


Our last step is to take the derivative with respect to A”. Then the term on the 
right is related to the photon propagator. The final expression becomes: 


rT 


a = 22 = | 4 = 
5AH#(x)5A"(y) lA=y=9=0 [a Suv Oe )4,,0, | 5° y) 


+ ie? / d*u d*u Tr [yy Sp(x, uAy(y, u, v)Sp(v, x)] (8.128) 


where we have defined the vertex function as: 


eT 


SAM(x) BUC) BYU) lacwapen — AHO V2) (8.129) 
We have also used the formula for taking the derivative of an inverse matrix:


\frac{\delta}{\delta A_\mu} M^{-1} = -M^{-1}\, \frac{\delta M}{\delta A_\mu}\, M^{-1}    (8.130)


where M = \delta^2\Gamma / \delta\psi\, \delta\bar\psi and:


S-(x, y) =" 


era: 
Geae 


d*p e iP —y) 
= 0 8.131 
(x)! f —m — Sp) ean) 


These functional relations, in turn, are identical to the Schwinger—Dyson equations 
introduced in Exercise (7.11). 

In summary, we have seen that the path integral method of Feynman is not 
only elegant and powerful, it is also very close to the original spirit of quantum 
mechanics. The formalism is so versatile that we can reproduce the canonical 
formalism discussed earlier, as well as quantize increasingly complicated theories, 
such as Yang—Mills theory and quantum gravity. The path integral formalism, in 
fact, has become the dominant formalism for high-energy physics. In Chapter 
9, we will see the power of the path integral approach when we quantize the 
Yang-Mills theory. 


8.8 Exercises 


1. Using path integrals for a free Dirac theory, calculate the expectation value 
of the product of six fermionic fields in terms of propagators. Show that 
the resulting expression is equivalent to the decomposition given by Wick’s 
theorem. 


2. Prove Eq. (8.14). 


3. Prove that the disconnected pieces in Eq. (8.78) cancel to second order in A. 
Sketch how the cancellation works at third order. 


4. Let θ_i be a Grassmann column vector. Let us make the transformation of
variables: φ_i = M_ij θ_j. We define:


\int d\phi_1\, d\phi_2 \cdots d\phi_n\, (\phi_1 \phi_2 \cdots \phi_n) = \int d\theta_1\, d\theta_2 \cdots d\theta_n\, (\theta_1 \theta_2 \cdots \theta_n)    (8.132)
Prove that this implies:
d\phi_1\, d\phi_2 \cdots d\phi_n = (\det M)^{-1}\, d\theta_1\, d\theta_2 \cdots d\theta_n    (8.133)


which is the opposite of the usual rule for differentials.


5. Derive the Schrodinger equation for an electron in the presence of a potential 
V(x) using path integrals. 


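6. For invertible, square matrices A, B, C, D, prove:

\begin{pmatrix} A & C \\ D & B \end{pmatrix}
= \begin{pmatrix} A - CB^{-1}D & CB^{-1} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ D & B \end{pmatrix}
= \begin{pmatrix} A & 0 \\ D & 1 \end{pmatrix} \begin{pmatrix} 1 & A^{-1}C \\ 0 & B - DA^{-1}C \end{pmatrix}

7. For matrices A, B, prove:
\log(AB) = \log A + \log B + \frac{1}{2}[\log A, \log B] + \cdots    (8.134)
8. Prove:
\det(1 + M) = 1 + {\rm Tr}\, M + \frac{1}{2}\left[ ({\rm Tr}\, M)^2 - {\rm Tr}(M^2) \right] + \cdots    (8.135)
9. For matrices A and B, prove, by power expansion, that:


e^A e^B = \exp\left( A + B + \frac{1}{2}[A, B] + \frac{1}{12}[A, [A, B]] + \frac{1}{12}[B, [B, A]] + \cdots \right)    (8.136)

10. For matrices A and B, prove that:
e^A e^B = \exp\left( [A, B] \right) e^B e^A
e^{A + B} = \exp\left( -\frac{1}{2}[A, B] \right) e^A e^B    (8.137)
Under what conditions are these identities valid?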
11. For matrices A and B, prove:

(e^A e^B)^n = \exp\left( \frac{n(n + 1)}{2}[A, B] \right) e^{nB} e^{nA}    (8.138)


Are there any restrictions on this formula?


12. Prove:


\int d\theta_n \cdots d\theta_2\, d\theta_1\, \exp\left( \frac{1}{2} \sum_{i,j=1}^{n} \theta_i D_{ij} \theta_j \right) = \sqrt{\det D}    (8.139)

for the case n = 3 by an explicit calculation.


13. Prove the previous relation for arbitrary n. Prove it in two ways: first,
by diagonalizing the D matrix and then performing the integration over the
eigenvalues of D; second, by power expanding the expression and using the
known identities for the antisymmetric ε tensor.


14. In Eq. (8.92), we established a relationship between the W and Γ. Find
the relationship between the fifth orders, then graphically illustrate what it
means.


For ¢7 theory, show that I? is actually one-particle irreducible. Expand it 
only to fourth order in the coupling. 


Chapter 9 
Gauge Theory 


We did not know how to make the theory fit experiment. It was our
judgment, however, that the beauty of the idea alone merited attention.
—C.N. Yang 


9.1 Local Symmetry 


An important revolution in quantum field theory took place in 1971, when the 
Yang-Mills theory was shown to be renormalizable, even after symmetry break- 
ing, and therefore was a suitable candidate for a theory of particle interactions. 
The theoretical landscape in particle physics rapidly changed; a series of impor- 
tant papers emerged in which the weak and strong interactions quickly yielded 
their secrets. This revolution was remarkable, given the fact that in the relative 
confusion of the 1950s and 1960s, it appeared as if quantum field theory was an 
unsuitable framework for particle interactions. 

Historically, gauge theory had a long but confused past. Although the Yang-Mills
equation had been discovered in 1938 by O. Klein¹ (who was studying
Kaluza-Klein theories), it was promptly forgotten during World War II. It was
resurrected independently by Yang and Mills² in 1954 (and also by Shaw³ and
Utiyama⁴), but it was unsuitable for particle interactions because it only described
massless vector particles. The discovery by 't Hooft⁵ that the theory could be made
massive while preserving renormalizability sparked the current gauge revolution.

Previously, we studied theories which were symmetric under global symmetries,
so the group parameter ε was a constant. But the essence of Maxwell and
Yang-Mills theory is that they are invariant under a symmetry that changes at
every space-time point; that is, they are locally invariant. This simple principle of
local gauge invariance, as we shall see, imposes highly nontrivial and nonlinear
constraints on quantum field theory.
We begin with the generators of some Lie algebra:
[\tau^a, \tau^b] = i f^{abc} \tau^c    (9.1)


Let the fermion field ψ_i transform in some representation of SU(N), not
necessarily the fundamental representation. It transforms as:


\psi_i(x) \to \Omega_{ij}(x)\, \psi_j(x)    (9.2)


where Ω_ij is an element of SU(N).
The essential point is that the group element Ω is now a function of space-time;
that is, it changes at every point in the universe. It can be parametrized as:


\Omega_{ij}(x) = \left( e^{-i\theta^a(x)\tau^a} \right)_{ij}    (9.3)


where the parameters θ^a(x) are local variables, and where τ^a is defined in whatever
representation we are analyzing.

The problem with this construction is that derivatives of the fermion field are
not covariant under this transformation. A naive transformation of the derivatives
of these fields picks up terms like ∂_μΩ. In order to cancel this unwanted term, we
would like to introduce a new derivative operator D_μ that is truly covariant under
the group. To construct such an operator D_μ, let us introduce a new field, called
the connection A_μ:


D_\mu = \partial_\mu - igA_\mu    (9.4)
where:
A_\mu(x) = A^a_\mu(x)\, \tau^a    (9.5)


The essential point of this construction is that the covariant derivative of the
ψ field is gauge covariant:


(D_\mu\psi)' = \partial_\mu\psi' - igA'_\mu\psi'
= \Omega\, \partial_\mu\psi + (\partial_\mu\Omega)\psi - igA'_\mu\Omega\psi
= \Omega D_\mu\psi    (9.6)


The troublesome term ∂_μΩ is precisely cancelled by the variation of the A_μ term
if we set:


A'_\mu(x) = -\frac{i}{g}\left[ \partial_\mu\Omega(x) \right]\Omega^{-1}(x) + \Omega(x)A_\mu(x)\Omega^{-1}(x)    (9.7)
Infinitesimally, this becomes:


\delta A^a_\mu = -\frac{1}{g}\partial_\mu\theta^a + f^{abc}\theta^b A^c_\mu
\delta\psi = -i\theta^a\tau^a\psi    (9.8)


[If we reduce SU(N) down to the group U(1), then we recover the field transfor- 
mations for QED. ] 

It is also possible to construct the invariant action for the connection field
itself. Since D_μ is covariant, the commutator of two covariant derivatives is
also covariant. We define the commutator as follows:


F_{\mu\nu} = \frac{i}{g}[D_\mu, D_\nu]
= \partial_\mu A_\nu - \partial_\nu A_\mu - ig[A_\mu, A_\nu]
= \left( \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + g f^{abc} A^b_\mu A^c_\nu \right)\tau^a    (9.9)
Because D_μ is genuinely covariant, this means that the F_μν tensor is also covariant:
F'_{\mu\nu} = \Omega F_{\mu\nu} \Omega^{-1}    (9.10)
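
The step from the matrix commutator to the structure-constant form of the field tensor can be checked numerically. The sketch below assumes the group SU(2), with τ^a = σ^a/2 and f^{abc} = ε^{abc}; the coupling and the field components are arbitrary numbers chosen for illustration, not values from the text.

```python
# Generators tau^a = sigma^a / 2 of SU(2), as 2x2 nested lists
tau = [
    [[0.0, 0.5], [0.5, 0.0]],
    [[0.0, -0.5j], [0.5j, 0.0]],
    [[0.5, 0.0], [0.0, -0.5]],
]


def eps(a, b, c):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) / 2


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def in_algebra(components):
    """Build the matrix A = A^a tau^a from its three components."""
    return [[sum(components[a] * tau[a][i][j] for a in range(3)) for j in range(2)]
            for i in range(2)]


g = 0.7                    # coupling (illustrative)
A_mu = [0.3, -1.2, 0.5]    # components A^a_mu at one point (illustrative)
A_nu = [0.9, 0.4, -0.8]    # components A^a_nu at the same point

Amu, Anu = in_algebra(A_mu), in_algebra(A_nu)
AB, BA = matmul(Amu, Anu), matmul(Anu, Amu)

# Left side: -i g [A_mu, A_nu] as a matrix
lhs = [[-1j * g * (AB[i][j] - BA[i][j]) for j in range(2)] for i in range(2)]

# Right side: g f^{abc} A^b_mu A^c_nu tau^a
coeffs = [g * sum(eps(a, b, c) * A_mu[b] * A_nu[c] for b in range(3) for c in range(3))
          for a in range(3)]
rhs = in_algebra(coeffs)

max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_diff)
```

The derivative terms of the field tensor are linear in A^a_μ, so they carry over to components trivially; only the commutator term needs the algebra [τ^a, τ^b] = i f^{abc} τ^c.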


We can now construct an invariant action out of this tensor. We want an action 
that only has two derivatives (since actions with three or higher derivatives are 
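not unitary, i.e., they have ghosts). The simplest invariant is given by the trace of
the square of the field tensor. This is invariant because:


{\rm Tr}\left( \Omega F_{\mu\nu} \Omega^{-1}\, \Omega F^{\mu\nu} \Omega^{-1} \right) = {\rm Tr}\left( F_{\mu\nu} F^{\mu\nu} \right)    (9.11)


The unique action with only two derivatives is therefore given by (with the
generators normalized so that Tr τ^a τ^b = δ^{ab}/2):


S = \int d^4x \left( -\frac{1}{2} {\rm Tr}\, F_{\mu\nu}F^{\mu\nu} \right) = \int d^4x \left( -\frac{1}{4} F^a_{\mu\nu} F^{a\mu\nu} \right)    (9.12)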


This is the action for the Yang—Mills theory, which is the starting point for all 
discussions of gauge theory. 

The field tensor F_μν, we should point out, also obeys the Bianchi identities. We
know, by the Jacobi identity, that certain multiple commutators vanish identically.
Therefore, we have:


[D_\mu, [D_\nu, D_\rho]] + [D_\nu, [D_\rho, D_\mu]] + [D_\rho, [D_\mu, D_\nu]] = 0    (9.13)
This is easily checked by explicitly writing out the terms in the commutators.
Written in terms of the field tensor, this becomes:


[D_\mu, F_{\nu\rho}] + [D_\nu, F_{\rho\mu}] + [D_\rho, F_{\mu\nu}] = 0    (9.14)


(It is important to stress that these are exact identities. They are not equations of
motion, nor are they new constraints on the field tensor.)

Lastly, since ψ̄ → ψ̄ Ω† and D_μψ → Ω D_μψ, it is easy to show that the
invariant fermion action coupled to the gauge field is given by:


S = \int d^4x\, \bar\psi (i\slashed{D} - m) \psi    (9.15)


9.2 Faddeev—Popov Gauge Fixing 


The real power of the path integral approach is that we have the freedom to choose 
whatever gauge we desire. This is impossible in the canonical approach, where 
the gauge has already been fixed. However, in the path integral approach, because 
gauge fixing is performed by inserting certain delta functions into the path integral, 
we can change the gauge by simply replacing these factors. This formalism was 
introduced by Faddeev and Popov.⁶

Historically, however, before the Faddeev-Popov method, the quantization of
Yang-Mills theory was not clear for many years. In 1962, Feynman⁷ showed that
the theory suffered from a strange kind of disease: The naive quantization of the 
theory was not unitary. In order to cancel the nonunitary terms from the theory, 
Feynman was led to postulate the existence of a term that did not emerge from 
the standard quantization procedure. Today, we know this ghost, first revealed by 
Feynman using unitarity arguments, as the Faddeev—Popov ghost. 

To begin, we first stress that the path integral of a theory like Maxwell's theory
of electromagnetism is, in principle, undefined because of the gauge degree of
freedom. Because the path integral measure DA and the action S are both gauge
invariant, functionally integrating over DA will overcount the
degrees of freedom of the theory. Because the Maxwell theory is invariant under
the gauge transformation:


A^\Omega_\mu = A_\mu + \partial_\mu\Omega    (9.16)
this means that functionally integrating over both A^Ω_μ and A_μ will overcount the
integrand repeatedly. In fact:


\int DA_\mu\, e^{iS} = \infty    (9.17)


If we begin with one field configuration A_μ and consider all possible A^Ω_μ, then
we are sweeping out an "orbit" in functional space. The problem is that, as Ω
changes along the orbit, we repeat ourselves an infinite number of times in the
path integral. Our problem is therefore to "slice" the orbit once so that we do not
have this infinite overcounting.

This is the origin of the gauge-fixing problem. To solve this problem, one is 
tempted to insert factors like: 


\delta(\partial_\mu A^\mu); \qquad \delta(\nabla \cdot {\bf A})    (9.18)


into the path integral, forcing it to respect the gauge choice ∂_μ A^μ = 0 or ∇·A = 0.
More generally, we would like to fix the gauge with an arbitrary function of
the fields:


\delta\left( F(A_\mu) \right)    (9.19)


which would fix the gauge to be F(A_μ) = 0.

The source of the problem is that inserting a delta function into the functional
integration DA changes the measure of the integration. We know
that a delta function changes when we make seemingly trivial changes in its
argument. For example, if we have a function f(x) that has a zero at x = a, we recall that:


\delta(f(x)) = \frac{\delta(x - a)}{|f'(x)|}    (9.20)
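
A quick numerical experiment makes the role of the Jacobian factor concrete. The sketch below is illustrative only: it approximates the delta function by a narrow normalized Gaussian of width eps, and the functions f and g, the interval, and the grid size are arbitrary choices made here.

```python
import math


def nascent_delta(u, eps):
    """Narrow normalized Gaussian that tends to delta(u) as eps -> 0."""
    return math.exp(-0.5 * (u / eps) ** 2) / (eps * math.sqrt(2.0 * math.pi))


def integrate(func, lo, hi, n):
    """Composite trapezoid rule with n sub-intervals."""
    h = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for k in range(1, n):
        total += func(lo + k * h)
    return total * h


def f(x):
    return x ** 3 + x - 2.0   # monotonic, with its only zero at a = 1


def f_prime(x):
    return 3.0 * x ** 2 + 1.0


def g(x):
    return math.cos(x)        # smooth test function


a, eps = 1.0, 1e-3
with_f = integrate(lambda x: nascent_delta(f(x), eps) * g(x), 0.0, 2.0, 200_000)
with_x = integrate(lambda x: nascent_delta(x - a, eps) * g(x), 0.0, 2.0, 200_000)

print(with_f, g(a) / abs(f_prime(a)))  # delta(f(x)) picks up 1/|f'(a)|
print(with_x, g(a))                    # delta(x - a) does not
```

Both delta functions enforce the same condition x = a, yet the two integrals differ by the factor |f'(a)| = 4. The Faddeev-Popov determinant plays the role of this factor for the gauge-fixing condition.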


The choice δ(f(x)) differs from the choice δ(x - a) by a factor of 1/|f'(x)|. Thus, there
is an ambiguity in the measure of integration. To resolve this ambiguity, we insert
the number "1" into the path integral, which we know has the correct measure:


1 = \Delta_{FP} \int D\Omega\, \delta\left( F(A^\Omega_\mu) \right)    (9.21)


where Δ_FP is the Faddeev-Popov determinant, which guarantees the correct
measure, and DΩ = ∏_x dΩ(x) is the invariant group measure. It satisfies the
invariance property:


D\Omega = D(\Omega'\Omega)    (9.22)
if Ω' is a fixed element of SU(N). Since Ω = 1 - iθ^a τ^a + ···, the group measure
(for small θ) is equal to:


D\Omega = \prod_{x,a} d\theta^a(x)    (9.23)


(More details concerning the invariant group measure will be presented in Chapter 15.)

Inserting the number "1" into the functional, we now have:


\int DA_\mu \left( \Delta_{FP} \int D\Omega\, \delta\left( F(A^\Omega_\mu) \right) \right) e^{iS}    (9.24)


Our task is now to calculate an explicit result for the Faddeev-Popov determinant.
To do this, we first notice that it is gauge invariant:


\Delta_{FP}(A_\mu) = \Delta_{FP}(A^\Omega_\mu)    (9.25)
This is because it is equal to the integration over all gauge Ω factors, and hence
is independent of the gauge. However, it will be instructive to see this more
explicitly.


Let us change the gauge, replacing A_μ in the Faddeev-Popov determinant
with A^Ω_μ. Then the definition becomes:


\Delta^{-1}_{FP}(A^\Omega_\mu) = \int D\Omega'\, \delta\left[ F(A^{\Omega'\Omega}_\mu) \right]
= \int D[\Omega'\Omega]\, \delta\left( F(A^{\Omega'\Omega}_\mu) \right)
= \int D\Omega''\, \delta\left( F(A^{\Omega''}_\mu) \right)
= \Delta^{-1}_{FP}(A_\mu)    (9.26)


Thus, it is gauge invariant. Now we come to the crucial part of the calculation. Let
us make a gauge transformation on the entire path integral, so that A_μ → A_μ^{Ω^{-1}}.
The measure DA, the Faddeev-Popov determinant, and the action S are all gauge
invariant. The factor that is not gauge invariant is F(A^Ω_μ), which changes into
F(A_μ). The integral, after the gauge transformation, now becomes:


\left( \int D\Omega \right) \int DA_\mu\, \Delta_{FP}\, \delta\left( F(A_\mu) \right) e^{iS}    (9.27)


We have now accomplished the following: 
1. We have explicitly isolated the infinite part of the matrix element, which is
given by:


\int D\Omega = \infty    (9.28)


By simply dividing out by ∫ DΩ, we remove the infinite overcounting.

2. The gauge choice is now enforced by δ(F(A_μ)).


3. The essential point is that the factor Δ_FP gives the correct measure in the path
integral. This is the factor that was missing for so many years in previous
attempts to quantize gauge theories.


The problem of gauge fixing is therefore reduced to finding an explicit expression
for Δ_FP. From Eq. (9.21), we know that the Faddeev-Popov determinant is
written in terms of δ(F(A^Ω_μ)), which in turn can be re-expressed via Eq. (9.20) as:


\delta\left[ F(A^\Omega_\mu) \right] = \delta(\Omega - \Omega_0) \left| \det \frac{\delta F(A^\Omega_\mu)(x)}{\delta\Omega(x')} \right|^{-1}    (9.29)


where the determinant is a functional generalization of the factor |f'(x)|.
Thus, the Faddeev-Popov term can be written as:


\Delta_{FP} = \det \left[ \frac{\delta F(A^\Omega_\mu)(x)}{\delta\Omega(x')} \right]    (9.30)


for F(A_μ) = 0. To perform explicit calculations with this determinant, it is
convenient to rewrite this expression in a form where we can extract new Feynman
rules. To solve for the determinant, let us power expand the factor F(A^Ω_μ) for a
small group parameter θ:


F(A^\Omega_\mu(x)) = F(A_\mu(x)) + \int d^4y\, M(x, y)\theta(y) + \cdots    (9.31)


In this approximation, only the matrix M survives when we take the derivative.
The Faddeev-Popov term can be converted into the determinant of the matrix M,
which in turn can be converted into a Gaussian integral over two fields c and c†:


\Delta_{FP} = \det M = \int Dc^\dagger\, Dc\, \exp\left( i \int d^4x\, d^4y\, c^\dagger(x) M(x, y) c(y) \right)    (9.32)


The determinant det M appears in the numerator of the path integral, rather than
the denominator. This means that when we exponentiate the Faddeev-Popov term
into the Lagrangian, we must integrate over Grassmann variables, rather than
bosonic variables, as we discussed in Section 8.6. Thus, we find that c and c† are
actually "ghost" fields, that is, scalar fields obeying Fermi-Dirac statistics. This
is the origin of the celebrated Faddeev-Popov ghosts.

Alternatively, we could have rewritten the determinant as follows:


\Delta_{FP} = \det M = \exp({\rm Tr} \log M)    (9.33)


where the determinant is understood to be taken over discretized x and y and any 
isospin indices. (To prove that the determinant of a matrix M can be written as the 
exponential of the trace of the logarithm of M, simply diagonalize the M matrix 
and re-express the formula in terms of the eigenvalues of M. Then this identity is 
trivial.) This term can be written in more familiar language if we write the matrix 
as M = 1+L and expand as follows: 


\det(1 + L) = \exp {\rm Tr}[\log(1 + L)] = \exp\left( \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} {\rm Tr}\, L^n \right)    (9.34)


The trace over L, we shall see shortly, can now be interpreted as closed loops in 
the Feynman expansion of the perturbation theory. 
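
The expansion in Eq. (9.34) can be checked on a small matrix. The sketch below uses an arbitrary 2 × 2 matrix L whose eigenvalues lie well inside the unit circle, so that the logarithmic series converges; the matrix entries and the number of terms kept are choices made here for illustration.

```python
import math


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def trace(X):
    return sum(X[i][i] for i in range(len(X)))


def det2(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]


L = [[0.10, 0.20], [-0.05, 0.15]]
one_plus_L = [[1.0 + L[0][0], L[0][1]], [L[1][0], 1.0 + L[1][1]]]

# Accumulate sum_n (-1)^(n-1)/n Tr(L^n); each term is a "closed loop" of n factors of L
series = 0.0
power = [[1.0, 0.0], [0.0, 1.0]]
for n in range(1, 61):
    power = matmul(power, L)
    series += (-1) ** (n - 1) / n * trace(power)

lhs = det2(one_plus_L)
rhs = math.exp(series)
print(lhs, rhs)
```

Each trace Tr L^n strings n copies of L into a closed cycle, which is why these terms appear as closed loops in the Feynman expansion.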

To gain some familiarity with this Faddeev—Popov term, let us compute the 
Faddeev—Popov determinant for the simplest gauge theory, Maxwell’s theory. 
Earlier, in the canonical and covariant approaches, we fixed the gauge without 
even thinking about complications due to the functional measure. We will find 
that we were lucky: The Faddeev—Popov measure is trivial for the gauges found 
in Maxwell theory, but highly nontrivial for non-Abelian gauge theory. To see 
this, let us choose the gauge: 


F(A_\mu) = \partial^\mu A_\mu = 0    (9.35)
The variation of this gauge fixing gives us:
F(A^\Omega_\mu) = \partial^\mu A_\mu + \partial^\mu \partial_\mu \Omega    (9.36)
Then the M matrix becomes:


M(x, y) = [\partial_\mu \partial^\mu]_{x,y}    (9.37)

Written in terms of ghost variables c and c†, this means we must add the following
term to the action:


\int d^4x\, d^4y\, c^\dagger(x)\, \partial^\mu \partial_\mu\, c(y)    (9.38)


where the integration over c† and c must now be viewed as being Grassmannian;
that is, these fields are scalar Grassmann ghosts. Fortunately, this determinant
decouples from the rest of the theory. The determinant of the Laplacian does not 
couple to any of the fermions or gauge fields in QED, and hence gives only an 
uninteresting multiplicative factor in front of the S matrix, which we can remove 
at will. 

For the Coulomb gauge, we find a similar argument, except that the gauge 
variation gives us a factor: 


\det \nabla^2    (9.39)


which also decouples from the theory. Thus, from the path integral point of view, 
we were fortunate that we could take these gauges in quantizing QED without 
suffering any problems. We are not so fortunate, however, for non-Abelian gauge
theories, where the Faddeev-Popov term gives us highly nontrivial corrections to the naive
theory. We begin by choosing a gauge for the theory, such as:


\partial^\mu A^a_\mu = 0    (9.40)


If we place this into the path integral, then we must at the same time insert the
Faddeev-Popov measure:


\Delta_{FP} = \det\left( M^{ab}_{x,y} \right)    (9.41)


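where x and y are discretized space-time variables, and a, b are isospin variables.
The matrix M is easily found:


M^{ab}(x, y) = \frac{\delta}{\delta\theta^b(y)}\, \partial^\mu \left( -\frac{1}{g}\partial_\mu\theta^a(x) + f^{acd}\theta^c(x) A^d_\mu(x) \right)
= \frac{1}{g} \left( -\delta^{ab}\partial^\mu\partial_\mu + g f^{abc}\partial^\mu A^c_\mu \right) \delta^4(x - y)    (9.42)


If we rescale and exponentiate this into the functional integral, we find an additional
term in the action given by:


\int d^4x\, c^\dagger_a \left( \delta^{ab}\partial^\mu\partial_\mu - g f^{abc}\partial^\mu A^c_\mu \right) c_b    (9.43)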
The c(x) are the celebrated Faddeev-Popov ghosts. These ghosts have peculiar
properties:


1. They are scalar fields under the Lorentz group, but have anticommuting
statistics, violating the usual spin-statistics theorem. They are therefore ghosts.
However, we do not care, since their job is to cancel the ghosts coming from
the quantization of the A^a_μ field.


2. These ghosts couple only to the gauge field. They do not appear in the external 
states of the theory. Therefore, they cannot appear in tree diagrams at all; they 
only make their presence in the loop diagrams, where we have an internal 
loop of circulating ghosts. 


3. These ghosts are an artifact of quantization. They decouple from the physical 
spectrum of states. 


9.3 Feynman Rules for Gauge Theory


Let us now put all the various ingredients together and write down the Feynman 
rules for the theory. The action plus the gauge-fixing contribution becomes: 


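\mathcal{L} + \mathcal{L}_{GF} = -\frac{1}{4} F^a_{\mu\nu} F^{a\mu\nu} - \frac{1}{2\alpha} (\partial^\mu A^a_\mu)^2    (9.44)


while the ghost contribution becomes:


\mathcal{L}_{FP} = \int d^4x\, d^4y \sum_{a,b} c^\dagger_a(x)\, [M(x, y)]_{ab}\, c_b(y)    (9.45)
where:
[M(x, y)]_{ab} = \partial^\mu \left( \delta_{ab}\partial_\mu - g f^{abc} A^c_\mu \right) \delta^4(x - y)    (9.46)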

To extract the Feynman rules from this action, we will decompose the action 
into a free part and interacting part: 


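\mathcal{L}_0 = -\frac{1}{4} (\partial_\mu A^a_\nu - \partial_\nu A^a_\mu)^2 - \frac{1}{2\alpha} (\partial^\mu A^a_\mu)^2 + c^\dagger_a \partial^\mu \partial_\mu c_a    (9.47)
and:


\mathcal{L}_I = -\frac{1}{2} g (\partial_\mu A^a_\nu - \partial_\nu A^a_\mu) f^{abc} A^{b\mu} A^{c\nu} - \frac{1}{4} g^2 f^{abc} f^{ade} A^b_\mu A^c_\nu A^{d\mu} A^{e\nu}
- g f^{abc} c^\dagger_a \partial^\mu (A^c_\mu c_b)    (9.48)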
From this, we can read off the Feynman rules for Yang-Mills theory: 


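1. The gauge meson propagator is given by:


i\Delta^{ab}_{\mu\nu}(k) = -i\delta^{ab} \left( g_{\mu\nu} - (1 - \alpha) \frac{k_\mu k_\nu}{k^2} \right) \frac{1}{k^2 + i\epsilon}    (9.49)
2. The (directed) ghost propagator is given by:
i\Delta^{ab}(k) = -i\delta^{ab} \frac{1}{k^2 + i\epsilon}    (9.50)


3. The three-gauge meson vertex function for mesons with momenta and quantum
numbers (k_1, μ, a), (k_2, ν, b), and (k_3, λ, c) is given by:


g f^{abc} \left[ (k_1 - k_2)_\lambda g_{\mu\nu} + (k_2 - k_3)_\mu g_{\nu\lambda} + (k_3 - k_1)_\nu g_{\mu\lambda} \right]    (9.51)


with \sum_{i=1}^{3} k_i = 0.


4. The four-gauge vertex, for mesons with quantum numbers (μ, a), (ν, b), (λ, c),
and (ρ, d), is given by:


-ig^2 \left[ f^{abe} f^{cde} (g_{\mu\lambda} g_{\nu\rho} - g_{\nu\lambda} g_{\mu\rho})
+ f^{ace} f^{bde} (g_{\mu\nu} g_{\lambda\rho} - g_{\lambda\nu} g_{\mu\rho})
+ f^{ade} f^{cbe} (g_{\mu\lambda} g_{\rho\nu} - g_{\rho\lambda} g_{\mu\nu}) \right]    (9.52)

with \sum_{i=1}^{4} k_i = 0.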
5. The two-ghost/gauge meson vertex function is given by: 
i, a 
oe eck, (9.53) 
Laan — 
a Bree where k,, is the incoming ghost’s momentum. 
€ b 
Now let us derive the Feynman rules for the gauge field coupled to a fermion
and a scalar field. We assume that the coupling to a scalar field is given by the
matrix R^a, taken in whatever representation we desire. The coupling is given by:


\mathcal{L} = \bar\psi \left[ i\gamma^\mu (\partial_\mu - igA^a_\mu \tau^a) - m \right] \psi
+ \left[ (\partial^\mu - igA^{a\mu} R^a)\phi \right]^\dagger \left[ (\partial_\mu - igA^a_\mu R^a)\phi \right] - m^2 \phi^\dagger\phi - \frac{\lambda}{4} (\phi^\dagger\phi)^2    (9.54)


Then the Feynman rules for this theory are given by:


1. The fermion propagator is given by:


iS_F(p)_{\alpha\beta}\, \delta_{ij} = \delta_{ij} \left( \frac{i}{\slashed{p} - m + i\epsilon} \right)_{\alpha\beta}    (9.55)

2. The scalar boson propagator is given by:

i\Delta_F(p)_{lm} = \frac{i\delta_{lm}}{p^2 - m^2 + i\epsilon}    (9.56)


3. The fermion-gauge-meson coupling is given by:


ig (\gamma_\mu)_{\alpha\beta} (\tau^a)_{ij}    (9.57)

4. The boson-gauge-meson coupling is given by:


ig (p + p')_\mu R^a_{lm}    (9.58)


5. The two-boson-gauge-meson coupling is given by:

ig^2 g_{\mu\nu} \{R^a, R^b\}_{lm}    (9.59)


6. The four-scalar couplings are given by -iλ.
9.4 Coulomb Gauge


To begin a discussion of gauge theory in the Coulomb gauge, it is first instructive 
to rewrite the original action as follows: 


1 v 1 v 
BER Bee (aA, — 3, Ane ef “AAG (9.60) 
where it is essential to notice that F7, is an independent field, totally unrelated to 
the vector field A’. (However, by eliminating the F,,, field by its equations of 
motion, we can show the equivalence with the standard Yang-Mills action.) This 
new version of the action is invariant under: 


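$$\begin{aligned}
\delta A^a_\mu &= -\frac{1}{g}\,\partial_\mu \theta^a + f^{abc}\,\theta^b A^c_\mu \\
\delta F^a_{\mu\nu} &= f^{abc}\,\theta^b F^c_{\mu\nu}
\end{aligned} \tag{9.61}$$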


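Now let us take the variation of the action with respect to $F^a_{\mu\nu}$, as well
as with respect to $A^a_\mu$. The two equations of motion yield:

$$\begin{aligned}
F^a_{\mu\nu} &= \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + g f^{abc} A^b_\mu A^c_\nu \\
0 &= \partial^\mu F^a_{\mu\nu} + g f^{abc} A^{b\,\mu} F^c_{\mu\nu}
\end{aligned} \tag{9.62}$$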


This new action will prove to be useful when quantizing the theory in the
Coulomb gauge. First, we will eliminate $F^a_{ij}$ in terms of the $A^a_i$ fields,
keeping $F^a_{0i}$ an independent field. Written in terms of these variables, we
find that the Lagrangian becomes:

$$\mathcal{L} = -\frac{1}{4} F^a_{ij}(A) F^a_{ij}(A) - \frac{1}{2} F^a_{0i} F^a_{0i} - F^a_{0i}\left(\partial_i A^a_0 - \partial_0 A^a_i + g f^{abc} A^b_i A^c_0\right) \tag{9.63}$$


Let us define: 


$$\begin{aligned}
E^a_i &= F^a_{0i} \\
B^a_i &= \frac{1}{2}\,\epsilon_{ijk}\, F^a_{jk}(A)
\end{aligned} \tag{9.64}$$


In terms of these fields, we now have: 


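$$\mathcal{L} = E^a_i \dot{A}^a_i - \mathcal{H} \tag{9.65}$$

where, after an integration by parts:

$$\mathcal{H} = \frac{1}{2}\left[(E^a_i)^2 + (B^a_i)^2\right] - A^a_0\left(\partial_i E^a_i + g f^{abc} A^b_i E^c_i\right) \tag{9.66}$$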


Written in this form, the similarity to the quantization of Maxwell's theory
in the Coulomb gauge is apparent. Of particular importance is the Lagrange
multiplier $A^a_0$, which multiplies the covariant divergence of $E^a_i$, which is
the gauge generalization of Gauss's Law found earlier for QED in Eq. (4.54). The
equation of motion for $A^a_0$ yields:

$$D_i E^a_i = \partial_i E^a_i + g f^{abc} A^b_i E^c_i = 0 \tag{9.67}$$

which indicates that not all conjugate momenta are independent, a situation
common to all gauge theories.

There are two ways in which to solve this important constraint. First, as in
the Gupta-Bleuler approach, we can apply this constraint directly onto the state
vectors of the theory:

$$D_i E^a_i\,|\Psi\rangle = 0 \tag{9.68}$$

However, it is easy to show that $D_i E^a_i$, acting on a field $A^b_j$, generates
the standard gauge transformation; that is:

$$\left[\int d^3x\, \Lambda^a(x)\, D_i E^a_i(x),\; A^b_j(y)\right]_{x_0 = y_0} = i\left(\partial_j \Lambda^b(y) + g f^{bcd} A^c_j(y)\, \Lambda^d(y)\right) \tag{9.69}$$


Therefore the constraint equation means that the state vectors $|\Psi\rangle$ of
the theory must be singlets under the gauge group. This is a rather surprising
result, because it implies that free vector mesons $A^a_\mu$ are not part of the
physical spectrum.
This is consistent, however, with our understanding of the quark model, where 
nonperturbative calculations indicate that the only allowed states are singlets 
under the “color” gauge group SU(3). The allowed singlets under the color group 
include quark—antiquark and three-quark bound states, which are the only ones 
seen experimentally. This will be important when we discuss the phenomenon of 
confinement in Chapters 11 and 15. 

The second approach in solving this constraint is to assume, for the moment,
that perturbation theory is valid and simply drop the higher-order terms. Then the
constraint equation reduces to the statement that the $E^a_i$ field is transverse.
To see this, we will decompose the $E^a_i$ field into transverse and longitudinal
modes:

$$E^a_i = E^{aT}_i + E^{aL}_i \tag{9.70}$$

where:

$$\nabla_i E^{aT}_i = 0 \tag{9.71}$$


An explicit decomposition of any field $E^a_i$ into transverse and longitudinal
parts can be given as:

$$E^a_i = \left(\delta_{ij} - \nabla_i \frac{1}{\nabla^2} \nabla_j\right) E^a_j + \nabla_i \frac{1}{\nabla^2} \nabla_j E^a_j \tag{9.72}$$
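In momentum space the decomposition becomes purely algebraic, since $\nabla_i (1/\nabla^2) \nabla_j$ turns into $k_i k_j / k^2$. The following is a minimal sketch for a single Fourier mode with the group index suppressed; the function name is ours and purely illustrative.

```python
def split_transverse_longitudinal(E, k):
    """Split one Fourier mode E_i(k) into transverse and longitudinal parts.

    In momentum space, nabla_i (1/nabla^2) nabla_j becomes k_i k_j / k^2.
    """
    k2 = sum(ki * ki for ki in k)
    if k2 == 0:
        raise ValueError("the decomposition is undefined at k = 0")
    k_dot_E = sum(ki * Ei for ki, Ei in zip(k, E))
    longitudinal = [ki * k_dot_E / k2 for ki in k]
    transverse = [Ei - Li for Ei, Li in zip(E, longitudinal)]
    return transverse, longitudinal


E = [1.0, 2.0, 3.0]
k = [0.0, 0.0, 2.0]
E_T, E_L = split_transverse_longitudinal(E, k)
# The transverse part has no component along k; the two parts add up to E.
print(E_T, E_L, sum(ki * ti for ki, ti in zip(k, E_T)))
```

The transverse part is annihilated by the divergence, while the longitudinal part is a pure gradient; this is what allows the longitudinal mode to be written in terms of a single scalar function.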


We will find it convenient to factor out the longitudinal mode explicitly by
introducing the field $f^a$:

$$E^{aL}_i = -\nabla_i f^a \tag{9.73}$$


Now insert these definitions back into the Gauss's Law constraint, which now
becomes:

$$\begin{aligned}
D_i E^a_i &= \nabla_i E^a_i + g f^{abc} A^b_i E^c_i \\
 &= -\nabla^2 f^a - g f^{abc} A^b_i \nabla_i f^c + g f^{abc} A^b_i E^{cT}_i = 0
\end{aligned} \tag{9.74}$$

For the case of Maxwell's field, this constraint was trivial to solve, since the
cross term was not present. For the Yang-Mills theory, this cross term gives us a
nontrivial complication.

Fortunately, we can now solve for $f^a$ by inverting this equation as a
perturbation series. Let us rewrite it as:

$$D^{ab}(A)\, f^b = g f^{abc} A^b_i E^{cT}_i \tag{9.75}$$

where:

$$D^{ab}(A) = \nabla^2 \delta^{ab} - g f^{abc} A^c_i \nabla_i \tag{9.76}$$

To solve this equation, we must introduce the Green's function:

$$\left(\nabla^2 \delta^{ab} - g f^{abc} A^c_i \nabla_i\right) \Delta^{bd}(x, y; A) = \delta^{ad}\, \delta^3(x - y) \tag{9.77}$$


Let us assume that we can invert this equation. (This statement, as we shall shortly
see, is actually incorrect.) Inverting this equation, we find the solution for $f^a$:

$$f^a(x) = g \int d^3y\, \Delta^{ab}(x, y; A)\, f^{bcd} A^c_k(y)\, E^{dT}_k(y) \tag{9.78}$$

where, as a power series in $g$:

$$\Delta^{ab}(x, y; A) = -\frac{\delta^{ab}}{4\pi |x - y|} + g \int d^3z\, \frac{1}{4\pi |x - z|}\, f^{abc} A^c_k(z)\, \nabla^z_k\, \frac{1}{4\pi |z - y|} + \cdots \tag{9.79}$$
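The structure of this perturbative inversion is the usual geometric (Neumann) series: if an operator splits as $D = D_0 - L$ with $D_0$ easy to invert, then $D^{-1} = G_0 + G_0 L G_0 + G_0 L G_0 L G_0 + \cdots$, where $G_0 = D_0^{-1}$. The following is a minimal finite-dimensional sketch, in which small matrices stand in for $\nabla^2$ and for the field-dependent term. The matrices and names are illustrative assumptions, not the actual operators of the gauge theory.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matadd(A, B):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def perturbative_inverse(G0, L, order):
    """Approximate (D0 - L)^{-1} by G0 + G0 L G0 + ... up to the given order."""
    term = [row[:] for row in G0]
    total = [row[:] for row in G0]
    for _ in range(order):
        term = matmul(matmul(G0, L), term)  # next term: G0 L (previous term)
        total = matadd(total, term)
    return total


D0 = [[2.0, 0.0], [0.0, 4.0]]   # stand-in for the free operator
G0 = [[0.5, 0.0], [0.0, 0.25]]  # its exact inverse
L = [[0.0, 0.1], [0.2, 0.0]]    # stand-in for the small, field-dependent piece
D = [[D0[i][j] - L[i][j] for j in range(2)] for i in range(2)]

approx = perturbative_inverse(G0, L, order=20)
residual = matmul(D, approx)    # should be close to the identity matrix
print(residual)
```

The series converges here because $G_0 L$ is small; the analogous statement in the field theory is that the expansion is a power series in $g$, and it silently assumes that the full operator has an inverse.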


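Let us now insert the solution for $f^a$ back into the Hamiltonian, where we have
explicitly solved the Gauss's Law constraint. The Hamiltonian becomes:

$$H = \frac{1}{2} \int d^3x \left[(E^{aT}_i)^2 + (B^a_i)^2 + (\nabla_i f^a)^2\right] \tag{9.80}$$

and the generating functional becomes:

$$Z = \int DE^a_i\, DA^a_i\; \delta(\nabla_i E^a_i)\, \delta(\nabla_i A^a_i)\,
 \exp\left\{ i \int d^4x \left[ E^a_i \dot{A}^a_i - \frac{1}{2}(E^a_i)^2 - \frac{1}{2}(B^a_i)^2 - \frac{1}{2}(\nabla_i f^a)^2 \right] \right\} \tag{9.81}$$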


From this, we can read off the Feynman rules for the Yang—Mills theory in 
the Coulomb gauge. We have written the path integral entirely in terms of the 
physical, transverse states. The theory is hence unitary, since all longitudinal 
modes with negative norm have been explicitly removed. 

There is, however, another form for the Coulomb path integral that is more
covariant looking and whose Feynman rules are easier to work with. Both forms
of the Coulomb path integral, of course, yield the same S matrix. If we start with
the original action in Eq. (9.60) written in terms of the auxiliary field
$F^a_{\mu\nu}$, then the path integral can be written as:

$$Z(J) = \int DA^a_\mu\, DF^a_{\mu\nu}\; \delta(\nabla_i A^a_i)\, \Delta_{FP}(A)\, \exp\left( i S(A, F) + i \int d^4x\, J^{a\,\mu} A^a_\mu \right) \tag{9.82}$$

where the Faddeev-Popov factor for the Coulomb gauge $\nabla_i A^a_i = 0$ can be
written as:

$$\Delta_{FP}(A) = \det\left[\frac{\delta\, \nabla_i A^{\Omega}_i(x)}{\delta\, \Omega(y)}\right]_{x, y} = \det\left[M(x, y)\right] \tag{9.83}$$

where:

$$M^{ab}(x, y) = D^{ab}(A)\, \delta^4(x - y) \tag{9.84}$$
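To calculate the Feynman rules from this action, we simply note that we can
write:

$$D^{ab}(A) = \nabla^2\left(\delta^{ab} + L^{ab}\right) \tag{9.85}$$

where:

$$L^{ab} = -g f^{abc}\, \frac{1}{\nabla^2}\, A^c_i \nabla_i \tag{9.86}$$

Now we use the expression:

$$\begin{aligned}
\det M &= \exp(\mathrm{Tr} \log M) = \left[\det \nabla^2\right]\left[\det(1 + L)\right] \\
 &\sim \exp\left[\mathrm{Tr} \log(1 + L)\right] \\
 &= \exp \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} \int d^4x_1 \cdots d^4x_n\; \mathrm{Tr}\left[L(x_1, x_2) \cdots L(x_n, x_1)\right]
\end{aligned} \tag{9.87}$$

where we have thrown away a factor of $\det \nabla^2$, which contributes closed
loops that do not couple to anything.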
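The identity behind this loop expansion, $\det(1 + L) = \exp \mathrm{Tr} \log(1 + L)$ with $\mathrm{Tr} \log(1 + L) = \sum_n (-1)^{n-1}\, \mathrm{Tr}\, L^n / n$, can be checked directly for a small matrix. The following is a minimal sketch with an arbitrary $2 \times 2$ matrix standing in for $L$; each power $\mathrm{Tr}\, L^n$ plays the role of a closed loop with $n$ insertions of the gauge field. The matrix entries and names are illustrative assumptions.

```python
import math


def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(A):
    """Sum of the diagonal entries."""
    return sum(A[i][i] for i in range(len(A)))


def det_via_trace_log(L, order):
    """exp( sum_{n=1}^{order} (-1)^(n-1) Tr(L^n) / n ), truncated at 'order'."""
    power = [row[:] for row in L]
    exponent = 0.0
    for n in range(1, order + 1):
        exponent += (-1) ** (n - 1) * trace(power) / n
        power = matmul(power, L)
    return math.exp(exponent)


L = [[0.1, 0.2], [0.05, -0.1]]
exact = (1 + L[0][0]) * (1 + L[1][1]) - L[0][1] * L[1][0]  # det(1 + L) for 2x2
series = det_via_trace_log(L, order=40)
print(exact, series)
```

The series converges only when $L$ is small, which mirrors the fact that the ghost-loop interpretation of the determinant is a perturbative statement.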

In summary, we have shown that there are two equivalent ways of writing the 
Coulomb gauge. The first, with all redundant ghost modes removed, is explicitly 
unitary (but difficult to calculate with). The second form, although not manifestly 
unitary, has a covariant form. The only correction to this covariant form is the 
determinant factor, which we can see from the previous expression consists of 
nothing but closed loops coupled to the $A_\mu$ field. Thus, from a calculational point 
of view, the only correction to the Feynman rules is to insert closed loops into the 
theory connected to gauge fields. 


9.5 The Gribov Ambiguity 


There is, however, one tricky point that we have glossed over. As we mentioned
earlier, the operator $D^{ab}(A)$ in Eq. (9.76) is actually not invertible, and hence
the Coulomb gauge does not, in fact, completely eliminate the gauge degree of
freedom of the theory.

This unexpected result is due to the fact that the Coulomb gauge $\nabla_i A^a_i = 0$
does not uniquely fix the gauge; that is, it is possible to find another $A'_\mu$,
which is gauge equivalent to $A_\mu$, that satisfies:

$$\nabla_i A'_i = 0 \tag{9.88}$$

To see this, we write $A'_\mu$ out in more detail:

$$A'_\mu = \Omega A_\mu \Omega^{-1} - \frac{i}{g}\left(\partial_\mu \Omega\right)\Omega^{-1} \tag{9.89}$$


The important point is that $A'_\mu$ contains a term proportional to $1/g$.
Perturbation theory, as a power expansion in $g$, will never pick up this factor.
As an example, take the group SU(2), and take $A_i$ to be gauge equivalent to the
number 0:

$$A_i = -\frac{i}{g}\left(\nabla_i \Omega\right)\Omega^{-1} \tag{9.90}$$

with $\nabla_i A_i = 0$. If the Coulomb gauge were a good gauge, then the only
solution of this equation should be $A_i = 0$. However, this is not so. For example,
we can parametrize the previous gauge using radial coordinates:

$$\Omega = \cos\frac{\omega(r)}{2} + i\, \sigma \cdot n\, \sin\frac{\omega(r)}{2} \tag{9.91}$$

where $n^i n^i = 1$ and $n^i = x^i / r$.
Then the Coulomb gauge condition becomes:

$$\frac{d^2\omega}{dt^2} + \frac{d\omega}{dt} - \sin 2\omega = 0; \qquad t = \log r \tag{9.92}$$


which is the equation of a damped pendulum in a constant gravitational field. If
$\omega = 0$, then $A_i = 0$, and this is the solution we desire. The problem,
however, is that there are obviously many other solutions to this equation other
than $\omega = 0$, so the uniqueness of the Coulomb gauge-fixing procedure is
violated. For a nonsingular solution, we want $\omega = 0, 2\pi, 4\pi, \ldots$ at
$t = -\infty$. But then the pendulum can either fall clockwise or counterclockwise
many times and then eventually wind up at a position of stable equilibrium at
$\omega = -\pi$. For these nontrivial solutions at $t \to \infty$, this means that we
have the asymptotic condition:

$$\Omega \to \pm i\, \frac{\sigma \cdot x}{r} \tag{9.93}$$


Because the Coulomb gauge does not uniquely fix the gauge, we will, in
general, find an infinite sequence of identical copies, each related by a gauge
transformation, each satisfying the Coulomb constraint. We will call these Gribov
copies.* Another way of saying this is that $M^{ab}(x, y; A)$ is a matrix that has a
zero eigenvalue. Not only do we have nonzero eigenvalues $\lambda_n$:

$$M^{ab}\, \psi^b_n(x) = \lambda_n\, \psi^a_n(x) \tag{9.94}$$

but we also have eigenfunctions with zero eigenvalue:

$$M^{ab}\, \phi^b_n(x) = 0 \tag{9.95}$$

Since the determinant of $M$ can be written as the product of its eigenvalues:

$$\det M^{ab} = \prod_n \lambda_n \prod_n 0 = 0 \tag{9.96}$$

the determinant itself is zero, and hence $M^{ab}$ cannot be inverted. As a
consequence, the formulas we have given for the Coulomb gauge are actually
slightly incorrect. The presence of these zero eigenvalue functions $\phi_n(x)$
spoils the inversion process, and hence spoils the Coulomb gauge fixing.
Thus the canonical quantization program for gauge theory, based on quantizing
the physical fields, does not exist, technically speaking.

Although we cannot fully fix the gauge with the choice $\nabla_i A^a_i = 0$, we can
still salvage our calculation in the Coulomb gauge. In fact, we can show, to any
order in perturbation theory, that these zero eigenvalues do not affect our
perturbative results. The reason we can ignore these states with zero eigenvalues
is that they do not couple to the physical Hilbert space. For example, we can show:

$$\langle \nabla^2 f^a \,|\, \phi^a_n \rangle = 0 \tag{9.97}$$

To see how this helps us, let us construct a modified propagator:

$$D^{ab}(A)\, G^{bc}(x, y; A) = \delta^{ac}\, \delta^3(x - y) - \sum_n \phi^a_n(x)\, \phi^{c\,*}_n(y) \tag{9.98}$$

With these zero modes explicitly subtracted out, we can define the inverse
operator $G^{ab}$ without any problems. The perturbation theory with $G^{ab}$ can
be used because:

$$G\, \nabla^2 f = \Delta\, \nabla^2 f \tag{9.99}$$

since $\Delta$ [in Eq. (9.77)] and $G$ only differ by the zero eigenvalue functions
$\phi_n$, which vanish when contracted onto $\nabla^2 f$.
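The logic of subtracting the zero modes has a simple finite-dimensional analogue. A symmetric matrix $M$ with a zero eigenvector $\phi_0$ has no inverse, but one can still define $G$ such that $M G = 1 - \phi_0 \phi_0^T$, and $G$ acts as a true inverse on any source orthogonal to $\phi_0$. The following is a minimal sketch with a hand-picked $2 \times 2$ example. The matrices are illustrative assumptions and stand in for the operators of the gauge theory.

```python
import math


def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matvec(A, v):
    """Matrix applied to a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


M = [[1.0, -1.0], [-1.0, 1.0]]                # eigenvalues 2 and 0
phi0 = [1 / math.sqrt(2), 1 / math.sqrt(2)]   # normalized zero mode: M phi0 = 0
G = [[0.25, -0.25], [-0.25, 0.25]]            # inverse of M away from phi0

# Right-hand side of the modified Green's function equation: 1 - phi0 phi0^T.
identity_minus_zero_mode = [
    [(1.0 if i == j else 0.0) - phi0[i] * phi0[j] for j in range(2)]
    for i in range(2)
]
MG = matmul(M, G)

# A source orthogonal to the zero mode is inverted without any problem.
source = [1.0, -1.0]
solution = matvec(G, source)
print(MG, matvec(M, solution))
```

Sources with a component along $\phi_0$ cannot be inverted at all, which is the finite-dimensional shadow of the statement that the zero modes must decouple for the perturbative calculation to make sense.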

The moral of this exercise is that, although the Coulomb gauge is riddled with
Gribov copies, we can safely ignore them as long as one stays within perturbation
theory. These Gribov copies carry an additional $1/g$ dependence beyond the first
copy, and hence cannot be picked up in perturbation theory. Once one leaves
perturbation theory to discuss nonperturbative phenomena, such as confinement,
then presumably these Gribov copies become important.

The fact that these Gribov copies can be ignored in perturbation theory is
fortunate, because the Faddeev-Popov quantization procedure becomes difficult
to work with in the presence of these copies. To see how the Faddeev-Popov
program is modified, let us enumerate the Gribov copies with an index $n$. Then
there are an infinite number of solutions to the equation:

$$\nabla_i A^{(n)}_i = 0 \tag{9.100}$$

The Faddeev-Popov determinant presumably becomes modified as follows:

$$\sum_n \Delta^{-1}_{FP}\left[A^{(n)}\right] = \int D\Omega\; \delta\left(\nabla_i A^{\Omega}_i\right) \tag{9.101}$$

Then the generating functional becomes:

$$Z(J) = \int DA_\mu \left(\sum_n \Delta^{-1}_{FP}\left[A^{(n)}\right]\right)^{-1} \delta\left(\nabla_i A_i\right)\, \exp\left( i S + i \int d^4x\, J^\mu A_\mu \right) \tag{9.102}$$


It is quite difficult to extract Feynman rules from a path integral as complicated 
as this. Thus, the standard ghost/loop interpretation of the Faddeev—Popov factor 
is now lost, and we have difficulty setting up the Feynman rules. However, as we 
have stressed, in perturbation theory the part analytic at g = 0 is only a function 
of the first Gribov copy, and so we can throw away all the infinite contributions 
from the ambiguity as long as we stick to perturbation theory. 


9.6 Equivalence of the Coulomb and Landau Gauge 


In general, the Green’s functions of a gauge theory are dependent on the gauge. 
However, because the Green’s functions are not directly measurable, this does not 
cause any harm. But the S matrix, because it is, by definition, measurable, should 
be independent of the gauge. 
We will now present a functional proof that the S matrix is independent of the
gauge. We will first establish how the generator of the Green's functions changes
when we go from the Coulomb gauge to the Landau gauge, and then show that
these modifications vanish when we go on-shell. We stress that our proof easily
generalizes to the arbitrary case, proving that the S matrix for gauge theory is
gauge independent. This will prove useful in the next chapter, where we discuss
gauge theory in different gauges. In one gauge, the "unitary" gauge, the theory
is unitary but not manifestly renormalizable. In the other, the "renormalizable"
gauge, the theory is renormalizable but not manifestly unitary. Because the S
matrix is gauge independent, this will show that the theory is both unitary and
renormalizable.

Our starting point is the generator of the Green's functions in the Coulomb
gauge:

$$Z_C(J) = \int DA_\mu\; \Delta_C[A_\mu] \prod_x \delta\left(\nabla_i A_i(x)\right) \exp\left( i S[A_\mu] + i \int d^4x\, J^\mu A_\mu \right) \tag{9.103}$$

where $\Delta_C[A_\mu]$ is the Faddeev-Popov measure coming from the Coulomb gauge.
Using functional methods, we will now change the gauge to the Landau gauge.
We will use the fact that the number "1" can be written, as usual, as:

$$1 = \Delta_L[A_\mu] \int D\Omega(x) \prod_x \delta\left(\partial^\mu A^{\Omega}_\mu(x)\right) \tag{9.104}$$

where this expression is written in the Landau gauge.

We will now insert the number "1" (written in the Landau gauge) into the
expression for the generator of Green's functions (written in the Coulomb gauge).
This insertion does not change anything, so the generator becomes:

$$\begin{aligned}
Z_C(J) = \int DA_\mu & \left( \int D\Omega(x) \prod_x \delta\left(\partial^\mu A^{\Omega}_\mu(x)\right) \Delta_L[A_\mu] \right) \Delta_C[A_\mu] \\
 & \times \prod_x \delta\left(\nabla_i A_i\right) \exp\left( i S[A_\mu] + i \int d^4x\, J^\mu A_\mu \right)
\end{aligned} \tag{9.105}$$


The insertion is contained within the first set of parentheses. Next, in order
to remove the $\Omega$ in the previous expression, we will make an inverse gauge
transformation on $A_\mu$:

$$A_\mu \to A^{\Omega^{-1}}_\mu \tag{9.106}$$

We will now use the fact that $S(A_\mu)$, the Faddeev-Popov determinant, and the
measure $DA_\mu$ are all gauge independent. The term $\delta(\partial^\mu A^{\Omega}_\mu)$
loses its dependence on $\Omega$, as desired. Thus, all terms involving the Landau
gauge lose their dependence on $\Omega$. On the other hand, we must carefully keep
track of those terms in the Coulomb gauge that are affected by this transformation.
Then the expression for $Z_C(J)$ becomes:

$$\begin{aligned}
Z_C(J) = \int DA_\mu\; & \Delta_L(A_\mu) \prod_x \delta\left(\partial^\mu A_\mu\right) \exp\left[ i S(A_\mu) \right] \\
 & \times \left( \Delta_C(A_\mu) \int D\Omega \prod_x \delta\left(\nabla_i A^{\Omega^{-1}}_i\right) \right) \exp\left( i \int d^4x\, J^\mu A^{\Omega^{-1}}_\mu \right)
\end{aligned} \tag{9.107}$$


Almost all the dependence on the Coulomb gauge is now concentrated within
the first set of large parentheses. Our next step is to show that the factor appearing
within these large parentheses can be set equal to one, thereby eliminating almost
all trace of the Coulomb gauge. To analyze the term within these parentheses, we
first observe that the integration over $D\Omega$ contains the delta function, which
forces us to pick out a particular value for $\Omega^{-1}$, which we will call
$\Omega_0$. This specific value of $\Omega$, because of the delta function constraint,
must satisfy:

$$\nabla_i A^{\Omega_0}_i = 0 \tag{9.108}$$

In other words, we can rewrite the delta function term as:

$$\int D\Omega \prod_x \delta\left(\nabla_i A^{\Omega^{-1}}_i\right) = \Delta_0 \int D\Omega\; \delta\left(\Omega^{-1} - \Omega_0(A)\right) \tag{9.109}$$


where $\Delta_0$ and $\Omega_0$ are the terms that we must calculate. In general, this
task is not an easy one, because $\Omega_0$ is a function of $A_i$ itself. However, we
can determine $\Omega_0$ because it must, by construction, satisfy:

$$\nabla_i A^{\Omega_0}_i = \nabla_i \left\{ \Omega_0 \left[ A_i - i g^{-1}\, \Omega_0^{-1} \nabla_i \Omega_0 \right] \Omega_0^{-1} \right\} = 0 \tag{9.110}$$

This expression cannot be solved exactly for $A^{\Omega_0}_i$. However, we can always
power expand this expression to find the perturbative solution, which is given by:

$$A^{\Omega_0}_i = \left(\delta_{ij} - \nabla_i \frac{1}{\nabla^2} \nabla_j\right) A_j + O(A^2) \tag{9.111}$$


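Thus, to any order of accuracy, we can solve for $A^{\Omega_0}_i$ and $\Omega_0$. The
advantage of introducing this expression for $\Omega_0$ is that we can now show that
the object in the first set of large parentheses in Eq. (9.107) equals one:

$$\begin{aligned}
\Delta_C(A_\mu) \int D\Omega \prod_x \delta\left(\nabla_i A^{\Omega^{-1}}_i\right)
 &= \Delta_C(A_\mu)\, \det\left[\frac{\delta\left(\nabla_i A^{\Omega}_i(x)\right)}{\delta \Omega(y)}\right]^{-1}_{x, y;\; \Omega = \Omega_0} \\
 &= \Delta_C(A_\mu)\, \Delta_C^{-1}\left[A^{\Omega_0}_\mu\right] = 1
\end{aligned} \tag{9.112}$$

where we have used the fact that $\Delta_{FP}$ is gauge invariant. Thus, the term in
the large parentheses can be set equal to one, and the expression for $Z_C(J)$ can be
written as:

$$Z_C(J) = \int DA_\mu\; \Delta_L(A_\mu) \prod_x \delta\left(\partial^\mu A_\mu\right) \exp\left( i S(A_\mu) + i \int d^4x\, J^\mu A^{\Omega_0}_\mu \right) \tag{9.113}$$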


The important point here is that we have lost all dependence on the Coulomb gauge,
except for the term $J^\mu A^{\Omega_0}_\mu$. This means that the Green's functions, as
expected, are all gauge dependent. The next task is to show that the S matrix is
independent of the gauge choice, even if the Green's functions are gauge dependent.
To do this, we must extract out the dependence on Coulomb gauge parameters from
the Landau gauge parameters. The source $J_\mu$ in the functional, because it
originally came from the Coulomb gauge, satisfies:

$$J_0 = \nabla_i J_i = 0 \tag{9.114}$$

We may therefore write:

$$\int d^4x\, J^\mu A^{\Omega_0}_\mu = \int d^4x\, J^\mu F_\mu(A) \tag{9.115}$$

where $F_\mu$ is defined, to lowest order, as:

$$F_\mu(A) = A_\mu(x) + O(A^2) \tag{9.116}$$

Now let us extract all dependence on the Coulomb gauge out of the
generator of the Green's functions:

$$Z_C(J) = \exp\left[ i \int d^4x\, J^\mu F_\mu\left(-i \frac{\delta}{\delta j}\right) \right] Z_L(j) \Big|_{j = 0} \tag{9.117}$$


We now have a functional expression relating the generating functional of the 
Landau gauge to the generating functional of the Coulomb gauge. Next, we must 
investigate the dependence of the S matrix on the gauge-dependent parameters. 
In order to compare our results for the S matrix, we must go on-shell; that
is, we must set $p^2 \to 0$ on the external legs of any graph. Then, if we 
look at the contribution of the F term to any Feynman graph in this on-shell limit, 
the only terms that survive are self-energy corrections (i.e., radiative corrections 
to the propagator). In Chapter 7, we showed that the net effect of self-energy 
corrections is to give us a multiplicative renormalization of the overall diagram. 
This renormalization, in turn, can be absorbed in the overall renormalization of the 
S matrix that must always be performed when calculating radiative corrections. 
Thus, the S matrices calculated in the Coulomb or Landau gauges only differ by 
a multiplicative constant, given by the renormalization of the self-energy parts, 
which in turn can be absorbed into the overall renormalization of the S matrix. We 
therefore find that the two formalisms give the same on-shell S matrix, as desired. 

In summary, we have seen how the path integral method gives us a convenient 
formalism in which to quantize gauge theory. The only complications are the 
Faddeev—Popov ghosts, which arise from the functional measure of integration. 
The power of the path integral method is that we can rapidly move from one 
gauge choice to another in gauge theory. This is crucial in order to show that 
gauge theory is both unitary and renormalizable. 

In the next chapter, we will construct a realistic theory of the weak
interactions from gauge theory. The essential ingredient will be spontaneous
symmetry breaking, which allows us to have massive vector mesons without spoiling
renormalization.


9.7 Exercises 


1. Choose the gauge A{ = AS. Is this a legitimate gauge? If so, construct the 
Faddeev—Popov ghost term. 


2. Quantize the theory in the gauge:

$$\partial_\mu A^\mu + c\, A_\mu A^\mu = 0 \tag{9.118}$$

Calculate the Faddeev-Popov ghosts.

3. Repeat the analysis for the gauge:

$$\partial_\mu A^\mu\, \partial^\nu A_\nu = 0 \tag{9.119}$$

4. Calculate the propagator in the gauge $n_\mu A^\mu = 0$, where $n_\mu$ is a
constant and normalized to $n^2 = 1$.
5. Analyze whether the axial gauge $A_3 = 0$ suffers from a Gribov ambiguity or
not. (Hint: see whether the gauge constraint is invertible or not.) Do other
gauges suffer from a Gribov ambiguity? Discuss the delicate points involved
with this problem.

6. A spin-3/2 field $\psi_\mu$ has both vector and spinor indices (we will suppress
the spinor index). Its action is given by:

$$\mathcal{L} = \epsilon^{\mu\nu\rho\sigma}\, \bar\psi_\mu \gamma_5 \gamma_\nu \partial_\rho \psi_\sigma \tag{9.120}$$

Show that it is invariant under a local gauge symmetry
$\delta\psi_\mu = \partial_\mu \alpha$, where $\alpha$ is a spinor. Break the gauge by
adding $-\frac{1}{2}(\bar\psi \cdot \gamma)\, \not{\partial}\, (\gamma \cdot \psi)$ to
the action. Show that the propagator equals $\gamma_\nu \not{k}\, \gamma_\mu / k^2$.

7. If the spin-3/2 field also carries an isospin index, can it be coupled to the
Yang-Mills field in a gauge covariant fashion? Justify your answer.

8. Consider an SU(N) Yang-Mills field coupled locally to a multiplet of massive
mesons in the adjoint representation of SU(N). Write down the Feynman rules
for the scalar-vector interaction vertices.

9. From a canonical point of view, the purpose of the Faddeev-Popov ghost is
to cancel the ghost modes coming from the Yang-Mills propagator. In the
Landau gauge, prove that this cancellation occurs at the one-loop level for
vector meson scattering.

10. Prove Eq. (9.34).

11. Prove that Eq. (9.78) solves Eq. (9.75).

12. By repeating the same steps used to show the equivalence between the
Coulomb and Landau gauge, show the equivalence between any two gauges
allowed by the Faddeev-Popov formalism.


Chapter 10 
The Weinberg—Salam Model 


If my view is correct, the universe may have a kind of domain structure. 
In one part of the universe, you may have one preferred direction of the 


axis; in another part, the direction of the axis may be different. 
—Y. Nambu 


10.1 Broken Symmetry in Nature 


In nature, a variety of beautiful and elegant symmetries surrounds us. However, 
there are also many examples of symmetries in nature that are broken. Rather 
than put explicit symmetry-breaking terms into the Hamiltonian by hand, which 
seems artificial and unappealing, we would like to break these symmetries in a 
way such that the equations retain their symmetry. 

Nature seems to realize this by exploiting the mechanism of spontaneous 
symmetry breaking; that is, the Hamiltonian is invariant under some symmetry, 
but the symmetry is broken because the vacuum state of the Hamiltonian is 
not invariant. The simplest examples come from solid-state physics, where the 
phenomenon of spontaneous symmetry breaking is quite common. Consider a 
ferromagnet, such that the atoms possess a spin $\sigma_i$. Although the Hamiltonian 
does not select out any particular direction in space, the ground state of the theory, 
however, can consist of atoms whose spins are all aligned in the same direction. 
Thus, rotational symmetry can be broken by the vacuum state, even when the 
Hamiltonian remains fully symmetric. To restore the symmetry, we have to heat 
the ferromagnet to a high temperature T, where the atoms once again become 
randomly aligned. 

In addition, spontaneous symmetry breakdown may also be associated with
the creation of massive vector fields. In the theory of superconductivity, for
example, spontaneous symmetry breaking occurs at extremely low temperatures,
giving us the Meissner effect, in which magnetic flux lines are expelled from the
interior of a superconductor. However, the magnetic field penetrates slightly into
the medium, so there is a finite-range electromagnetic field, which corresponds to
a "massive photon." We conclude that spontaneous symmetry breaking can, under
certain circumstances, give mass to a massless vector field.

Figure 10.1. The first potential corresponds to a unique vacuum, or a Klein-Gordon field
with a positive mass. The second potential, which exhibits spontaneous symmetry breaking,
corresponds to a scalar field with a tachyon mass.

Similarly, spontaneous symmetry breaking gives us a solution to the problem
that faced physicists trying to write down a theory of the weak interactions in
the 1950s and 1960s. One candidate was the massive vector meson theory. However,
the massive vector meson theory was known to be nonrenormalizable by simple
power counting arguments. Spontaneous symmetry breaking, however, solves
this problem. It preserves the renormalizability of the original gauge theory even
after symmetry breaking, giving us a renormalizable theory of massive vector
mesons.

To illustrate spontaneous symmetry breaking, let us begin our discussion with
a scalar field with a $\phi^4$ interaction, which has the symmetry $\phi \to -\phi$:

$$\mathcal{L} = \frac{1}{2}\, \partial_\mu \phi\, \partial^\mu \phi - \frac{1}{2} m^2 \phi^2 - \frac{\lambda}{4!}\, \phi^4 \tag{10.1}$$


If $m^2$ is negative, then $m$ is imaginary; that is, we have tachyons in the theory.
However, quantum mechanically, we can reinterpret this theory to mean that we
have simply expanded around the wrong vacuum.

In Figure 10.1, we see two potentials, one described by $m^2\phi^2/2 + \lambda\phi^4/4!$
with positive $m^2$, which gives us a unique vacuum, and another with negative $m^2$,
which corresponds to a tachyon mass.

For the second potential, a particle would rather not sit at the usual vacuum
$\phi = 0$. Instead, it prefers to move down the potential to a lower-energy state
given by the bottom of one of the wells. Thus, we prefer to power expand around the
new minimum $\phi = v$.
(Classically, a rough analogy can be made with a vertical, stationary rod that
is hanging from the ceiling. Normally, the rod's lowest-energy state is given by
$\theta = 0$. If we displace the rod by an angle $\theta$, then the potential resembles
the first figure in Fig. 10.1. However, if the rod is sent spinning around the
vertical axis, then it will reach a new equilibrium at some fixed angle
$\theta = \theta_0$, and its effective potential will resemble the second figure in
Fig. 10.1. For the spinning rod, $\theta = 0$ is no longer the lowest-energy state.)

For the scalar particle we discussed before, if $m^2$ is positive, then there is only
one minimum at the usual vacuum configuration $\phi = 0$ and the potential is still
symmetric; that is, it has the symmetry $\phi \to -\phi$.

However, if $m^2$ is negative, then quantum mechanically the vacuum expectation
value of the $\phi$ field no longer vanishes because we have chosen the wrong
vacuum. For the tachyonic potential, we can easily find the location of the new
minimum:

$$\phi = v = \pm\sqrt{-6 m^2 / \lambda} \tag{10.2}$$
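The location of this minimum, and the curvature of the potential there, are easy to verify numerically. The following is a minimal sketch for the potential $V(\phi) = m^2\phi^2/2 + \lambda\phi^4/4!$; the function names and the sample values of $m^2$ and $\lambda$ are illustrative assumptions.

```python
import math


def potential(phi, m2, lam):
    """V(phi) = m^2 phi^2 / 2 + lambda phi^4 / 4!"""
    return 0.5 * m2 * phi ** 2 + lam * phi ** 4 / 24.0


def slope(phi, m2, lam):
    """dV/dphi = m^2 phi + lambda phi^3 / 6"""
    return m2 * phi + lam * phi ** 3 / 6.0


def curvature(phi, m2, lam):
    """d^2V/dphi^2 = m^2 + lambda phi^2 / 2"""
    return m2 + 0.5 * lam * phi ** 2


def vacuum_value(m2, lam):
    """Positive minimum of V: zero for m^2 >= 0, sqrt(-6 m^2 / lambda) otherwise."""
    if m2 >= 0:
        return 0.0
    return math.sqrt(-6.0 * m2 / lam)


m2, lam = -1.0, 0.5          # negative m^2: the "tachyonic" case
v = vacuum_value(m2, lam)
print(v, slope(v, m2, lam), curvature(v, m2, lam), potential(v, m2, lam))
```

At the shifted minimum the slope vanishes and the curvature equals $-2m^2$, which is positive when $m^2$ is negative; the would-be tachyon is an artifact of expanding around $\phi = 0$.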


Normally, we demand that the vacuum expectation value of the scalar field
vanishes. However, if the naive vacuum is the incorrect one, then we find instead:

$$\langle 0 | \phi | 0 \rangle = v \tag{10.3}$$

Clearly, our troubles have emerged because we have power expanded around the
wrong vacuum. To correct this situation, we must shift the value of the $\phi$ field
as follows:

$$\tilde\phi = \phi - v \tag{10.4}$$

In terms of the shifted field $\tilde\phi$, we now have broken the original symmetry
$\phi \to -\phi$ and:

$$\langle 0 | \tilde\phi | 0 \rangle = 0 \tag{10.5}$$

We also have a new action given by:

$$\mathcal{L} = \frac{1}{2}\, \partial_\mu \tilde\phi\, \partial^\mu \tilde\phi + m^2 \tilde\phi^2 - \frac{1}{6} \lambda v\, \tilde\phi^3 - \frac{1}{4!} \lambda\, \tilde\phi^4 \tag{10.6}$$

Because $m^2$ is negative, we have an ordinary scalar particle with a positive mass
squared given by $-2m^2$. The original symmetry between $\phi$ and $-\phi$ has now
been spontaneously broken because the field has been shifted and there is a new
vacuum.

Figure 10.2. The new solution to the minimum of the potential corresponds to a ring of
solutions.


Let us now examine a less trivial example given by the global O(N) scalar
theory, where the field $\phi^i$ transforms as an N-vector:

$$\mathcal{L} = \frac{1}{2}\, \partial_\mu \phi^i\, \partial^\mu \phi^i - \frac{1}{2} m^2 \phi^i \phi^i - \frac{\lambda}{4!} \left(\phi^i \phi^i\right)^2 \tag{10.7}$$

Again, if the mass term has the wrong sign (i.e., if $m^2$ is negative), then there
is a new vacuum solution given by:

$$\phi^i \phi^i = v^2 = -6 m^2 / \lambda \tag{10.8}$$


Contrary to the previous potential that we examined, which had two minima,
this theory has an infinite number of vacua. In Figure 10.2, we can see that there
is actually a degenerate ring of solutions sitting in the bottom of the potential well.

Any solution of this equation yields a new vacuum. Now let us break this
degeneracy by singling out a specific direction in isospin space, the last entry in
the column vector:

$$\langle \phi \rangle_0 = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ v \end{pmatrix} \equiv (v_i) \tag{10.9}$$

where $v_i = 0$ if $i = 1, \ldots, N - 1$ and $v_N = v$.

This new vacuum is still invariant under the group that leaves the last entry
$v_N = v$ invariant, that is, the subgroup O(N - 1). We are thus breaking the original
symmetry group O(N) down to O(N - 1) with this choice of vacuum.
Let $\tau^i$ be the generators of O(N). These generators, in turn, can be divided
into two groups: the generators of O(N - 1), and everything else. Let $\hat\tau^i$
equal these remaining generators. The simplest parametrization of this new vacuum
is to let the last component of the $\phi^i$ field develop a nonzero vacuum
expectation value: $\phi_N \to \phi_N + v$, keeping the other $\phi_i$ the same.
However, we will find it convenient to introduce a slightly more complicated
parametrization of the N fields within the vector $\phi^i$. We will choose instead:

$$\phi = e^{i \xi_i \hat\tau^i / v} \begin{pmatrix} 0 \\ 0 \\ \vdots \\ v + \sigma(x) \end{pmatrix} \tag{10.10}$$

where we have replaced the original N fields $\phi^i$ with a new set, given by N - 1
fields $\xi_i$ and by $\sigma$.

To see the reason for this particular parametrization, we recall that O(N) has
(1/2)N(N - 1) generators. The number of generators in O(N), after we have
subtracted out the generators of O(N - 1), is:

$$\frac{1}{2} N(N - 1) - \frac{1}{2}(N - 1)(N - 2) = N - 1 \tag{10.11}$$

Thus, there are N - 1 generators $\hat\tau^i$ that are not generators of the O(N - 1)
subgroup.
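The counting above is simple enough to check mechanically for any N. The following is a minimal sketch; the function names are ours and purely illustrative.

```python
def num_generators_orthogonal(n):
    """Number of generators of O(n): one for each independent rotation plane."""
    return n * (n - 1) // 2


def num_broken_generators(n):
    """Generators of O(n) that are not generators of the O(n - 1) subgroup."""
    return num_generators_orthogonal(n) - num_generators_orthogonal(n - 1)


for n in range(2, 7):
    print(n, num_generators_orthogonal(n), num_broken_generators(n))
```

For O(3) broken to O(2), for example, this gives 3 - 1 = 2 generators outside the unbroken subgroup, one for each direction in which the vacuum vector can be tilted.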

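A particularly useful parametrization for the generators of O(N) is given by a
series of delta functions:

$$\begin{aligned}
(\tau^{ij})_{kl} &= -i\left(\delta_{ik}\delta_{jl} - \delta_{il}\delta_{jk}\right) \\
(\hat\tau^{i})_{kl} = (\tau^{iN})_{kl} &= -i\left(\delta_{ik}\delta_{Nl} - \delta_{il}\delta_{Nk}\right)
\end{aligned} \tag{10.12}$$

To lowest order, we have $\phi_i = \xi_i$ for $i < N$ and $\phi_N = v + \sigma$, and the
action becomes:

$$\mathcal{L} = \frac{1}{2}\left(\partial_\mu \sigma\, \partial^\mu \sigma + \partial_\mu \xi_i\, \partial^\mu \xi_i\right) - \frac{1}{2} m^2 (v + \sigma)^2 - \frac{\lambda}{4!} (v + \sigma)^4 + \text{higher terms} \tag{10.13}$$

The N - 1 $\xi_i$ fields have become massless, and $\sigma$ is now massive but no longer
a tachyon. The action is still invariant under a residual symmetry, O(N - 1).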
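The mass spectrum can be read off from the matrix of second derivatives of the potential $V = m^2 \phi^i\phi^i/2 + (\lambda/4!)(\phi^i\phi^i)^2$ evaluated at the vacuum. The following is a minimal sketch, assuming the vacuum points along the last component; the function names and the sample values of $m^2$ and $\lambda$ are illustrative assumptions.

```python
import math


def hessian(phi, m2, lam):
    """Second derivatives of V = m2/2 phi.phi + lam/4! (phi.phi)^2.

    d^2V / dphi_i dphi_j = (m2 + lam phi.phi / 6) delta_ij + lam phi_i phi_j / 3
    """
    n = len(phi)
    phi2 = sum(x * x for x in phi)
    return [[(m2 + lam * phi2 / 6.0) * (1.0 if i == j else 0.0)
             + lam * phi[i] * phi[j] / 3.0
             for j in range(n)] for i in range(n)]


def mass_matrix_at_vacuum(n, m2, lam):
    """Mass-squared matrix for the vacuum (0, ..., 0, v), v^2 = -6 m2 / lam."""
    v = math.sqrt(-6.0 * m2 / lam)
    vacuum = [0.0] * (n - 1) + [v]
    return hessian(vacuum, m2, lam)


M2 = mass_matrix_at_vacuum(3, m2=-1.0, lam=6.0)
for row in M2:
    print(row)
```

With the vacuum along the last axis the matrix is diagonal: the first N - 1 entries vanish, and the last entry equals $-2m^2$, which is positive for negative $m^2$.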

We emphasize that the number of massless bosons $\xi_i$ is equal to the number of
broken generators $\hat\tau^i$. We call these massless bosons, which signal the
spontaneous breakdown of the theory, Nambu-Goldstone bosons.^{1-3} These bosons will
play a special role in constructing realistic gauge theories of the weak interactions.
In summary, we have:

$$O(N) \to O(N - 1) \;\Rightarrow\; N - 1 \text{ Nambu-Goldstone bosons} \tag{10.14}$$

that is, the breaking of O(N) symmetry down to O(N - 1) symmetry leaves us
with N - 1 Nambu-Goldstone bosons $\xi_i$, or one boson for each of the broken
generators $\hat\tau^i$.


10.2 The Higgs Mechanism 


We now have a rule that the number of massless Nambu-Goldstone bosons equals
the number of broken generators of the theory. Let us discuss a more general
action to see if this result still holds:

$$\mathcal{L} = \frac{1}{2}\, \partial_\mu \phi_i\, \partial^\mu \phi_i - V(\phi_i) \tag{10.15}$$

where $\phi_i$ transforms under some representation of a group G, which has N
generators.

Let us say that there is a nontrivial vacuum, which we can find by calculating
the minimum of the potential:

$$\frac{\partial V}{\partial \phi_i}\bigg|_{\phi = v} = 0 \tag{10.16}$$

Let $\langle \phi_i \rangle_0 = v_i$ be a solution of this equation that minimizes the
potential. We find that this new vacuum is still invariant under a subgroup of G,
called H, which has M generators. This means that there are M generators
$\bar L^a_{ij}$ that leave $v_i$ unchanged; that is, they satisfy:

$$\bar L^a_{ij}\, v_j = 0 \tag{10.17}$$

There are also N - M generators $\hat L^a_{ij}$ for which $\hat L^a_{ij} v_j \neq 0$.
In other words, the generators $L^a_{ij}$ are of two types: $\bar L^a_{ij}$, which
generate the subgroup H, and $\hat L^a_{ij}$, which are all the remaining generators.
The first set of generators annihilates $v_i$, by construction, while the second set
does not.

Next, we wish to show that expanding around this new vacuum will create 


N — M massless boson fields. To begin, we define the variation of the scalar field 
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as: 
So = —i0° Lb (10.18) 


The potential V() is invariant under this transformation, so we have: 


dV ae 
ae i? Lio; = (10.19) 


Since the parameters 6° are arbitrary, we have N equations: 


acta 
5g) Lids = 0 (10.20) 


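Differentiating this with respect to $\phi_k$, we arrive at:


$$ \frac{\delta^2 V}{\delta\phi_k\,\delta\phi_i}\, L^a_{ij}\,\phi_j + \frac{\delta V}{\delta\phi_i}\, L^a_{ik} = 0 \tag{10.21} $$

Substituting the value of $\phi$ at the minimum into the previous equation (where the
first derivative of V vanishes), we find:


$$ \left. \frac{\delta^2 V}{\delta\phi_k\,\delta\phi_i} \right|_{\phi = v} L^a_{ij}\, v_j = 0 \tag{10.22} $$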


If we Taylor expand the potential around the new minimum $\phi_i = v_i$, then the mass
matrix can be defined as:


$$ V(\phi) = \frac{1}{2}\,(M^2)_{ij}\,(\phi - v)_i\,(\phi - v)_j + \cdots \tag{10.23} $$


Now let us insert the previous equation for the mass matrix into Eq. (10.22).
This gives us:


$$ (M^2)_{ij}\, L^a_{jk}\, v_k = 0 \tag{10.24} $$


This equation is trivially satisfied if $L^a$ is a generator of the subgroup H, since
$L^a_{ij} v_j = 0$. The situation is more interesting, however, when the generator is one
of the N - M generators of G that are outside H.

For these N - M generators, Eq. (10.24) is an eigenvalue equation for
the matrix $M^2$. It states that for each of the N - M generators, there is a zero
eigenvalue of the $M^2$ matrix. Since the eigenvalues of the $M^2$ matrix give us the
mass spectrum of the fields, we have N - M massless bosons in the theory.

Thus, after symmetry breaking there are N - M massless Nambu-Goldstone
bosons, one for each broken generator.
In summary:

$$ G \to H \;\Rightarrow\; N - M \ \text{Nambu-Goldstone bosons} \tag{10.25} $$
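
The counting in Eq. (10.25) can be checked numerically in a small case. The sketch below is only an illustration: it assumes the potential V = (lam/4)(phi.phi - v^2)^2 for an O(3) triplet, which is not taken from the text, and uses real antisymmetric matrices T^a = i L^a as generators. The function names are hypothetical.

```python
# Illustrative check of Goldstone counting for O(3) -> O(2).
# Assumed potential (not from the text): V = (lam/4) * (phi.phi - v^2)^2.

def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def generator(a):
    """Real antisymmetric generator (T^a)_{jk} = epsilon_{ajk}."""
    return [[levi_civita(a, j, k) for k in range(3)] for j in range(3)]

def matvec(matrix, vector):
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]

def mass_matrix(vac, lam):
    """Hessian of the assumed potential at its minimum: 2 * lam * v_i * v_j."""
    n = len(vac)
    return [[2.0 * lam * vac[i] * vac[j] for j in range(n)] for i in range(n)]

def count_goldstone_modes(vac, lam, tol=1e-12):
    """Count broken generators and verify each gives a zero mode of M^2."""
    m2 = mass_matrix(vac, lam)
    broken = 0
    for a in range(3):
        direction = matvec(generator(a), vac)      # T^a v
        if max(abs(c) for c in direction) > tol:   # generator is broken
            # Eq. (10.24): M^2 (T^a v) must vanish
            assert max(abs(c) for c in matvec(m2, direction)) < tol
            broken += 1
    return broken

print(count_goldstone_modes([0.0, 0.0, 1.5], lam=0.3))
```

With the vacuum along the third axis, two of the three generators are broken, so N - M = 3 - 1 = 2.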


A surprising feature of this method, however, arises when we apply the
Nambu-Goldstone theorem to gauge theories. In this case, we will find that these
Nambu-Goldstone bosons are “eaten up” by the gauge particles, converting massless
Yang-Mills vector particles into massive ones. This is called the Higgs-Kibble
mechanism.^{4-6}

Furthermore (and this is the key point), the Yang—Mills theory remains renor- 
malizable even after the Higgs mechanism has generated massive vector particles. 
In other words, this is the long-sought-after mechanism that can render a massive 
vector theory renormalizable. It is not an exaggeration to say that this discovery, 
by ’t Hooft, changed the landscape of theoretical particle physics. 

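To see how the Higgs mechanism works, let us begin with a theory of complex
scalar particles coupled to Maxwell’s theory. The action is:


$$ \mathcal{L} = D_\mu\phi^\dagger D^\mu\phi - m^2\phi^\dagger\phi - \lambda(\phi^\dagger\phi)^2 - \frac{1}{4}F_{\mu\nu}F^{\mu\nu} \tag{10.26} $$

The coupled system, as before, is invariant under:

$$ \begin{aligned} \phi &\to e^{-i\theta(x)}\phi \\ \phi^\dagger &\to e^{i\theta(x)}\phi^\dagger \\ A_\mu &\to A_\mu - \frac{1}{e}\,\partial_\mu\theta(x) \end{aligned} \tag{10.27} $$


The action is invariant under U(1) = SO(2).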
Now we break this symmetry; the new vacuum is given by: 


$$ \langle\phi\rangle_0 = v/\sqrt{2} \tag{10.28} $$


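We will find it convenient to parametrize the complex field $\phi$ by introducing
two fields $\sigma$ and $\xi$:


$$ \begin{aligned} \phi &= e^{i\xi/v}\,(v + \sigma)/\sqrt{2} \\ &= (v + \sigma + i\xi + \cdots)/\sqrt{2} \end{aligned} \tag{10.29} $$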


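We can now make a gauge rotation on both $\phi$ and $A_\mu$:


$$ \begin{aligned} \phi &\to \phi' = e^{-i\xi/v}\,\phi = (v + \sigma)/\sqrt{2} \\ A_\mu &\to A'_\mu = A_\mu - \frac{1}{ev}\,\partial_\mu\xi \end{aligned} \tag{10.30} $$

(We have gauge rotated the $\xi$ field away in the definition of $\phi'$ and $A'_\mu$.)
The final action now reads:

$$ \begin{aligned} \mathcal{L} = {} & -\frac{1}{4}F'_{\mu\nu}F'^{\mu\nu} + \frac{1}{2}\,\partial_\mu\sigma\,\partial^\mu\sigma + \frac{1}{2}e^2v^2 A'_\mu A'^\mu \\ & + \frac{1}{2}e^2 A'_\mu A'^\mu\,\sigma(2v + \sigma) - \frac{1}{2}\sigma^2(3\lambda v^2 + m^2) - \lambda v\sigma^3 - \frac{1}{4}\lambda\sigma^4 \end{aligned} \tag{10.31} $$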


The $A'_\mu$ field has become massive. The key point is that the $\xi$ field has disappeared
completely. In other words, it has been “gauged away,” or “eaten up” by the vector
field.
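
As a quick numerical illustration of the spectrum after the shift, the sketch below evaluates the potential of Eq. (10.26) on the shifted field $\phi = (v + \sigma)/\sqrt{2}$ and extracts the $\sigma$ mass by finite differences. The parameter values are arbitrary choices made for illustration, and the function names are hypothetical.

```python
import math

def higgs_potential(sigma, m2, lam):
    """V = m2*|phi|^2 + lam*|phi|^4 evaluated at phi = (v + sigma)/sqrt(2)."""
    v = math.sqrt(-m2 / lam)           # vacuum value; requires m2 < 0 < lam
    phi_sq = 0.5 * (v + sigma) ** 2    # phi^dagger phi
    return m2 * phi_sq + lam * phi_sq ** 2

def sigma_mass_squared(m2, lam, h=1e-4):
    """Second derivative of V at sigma = 0 by central differences."""
    return (higgs_potential(h, m2, lam)
            - 2.0 * higgs_potential(0.0, m2, lam)
            + higgs_potential(-h, m2, lam)) / h ** 2

def vector_mass(e, m2, lam):
    """Vector mass e*v, read off from a term (1/2) e^2 v^2 A_mu A^mu."""
    return e * math.sqrt(-m2 / lam)

# Arbitrary illustrative parameters: m^2 = -1, lambda = 0.5, e = 0.3
print(sigma_mass_squared(-1.0, 0.5), vector_mass(0.3, -1.0, 0.5))
```

The finite-difference value agrees with $3\lambda v^2 + m^2 = -2m^2$, so the "wrong-sign" mass term turns into a positive mass squared for $\sigma$, while the vector field acquires mass $ev$.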

The Nambu-Goldstone boson, corresponding to the breaking of U(1) invariance,
has disappeared and reappeared in a new guise, as the longitudinal component
of the now-massive vector field. Since U(1) is now broken, there is no longer any
gauge symmetry left to prevent the vector field from acquiring a mass, and
hence it acquires a massive mode (at the expense of the $\xi$ mode, which vanishes).
Thus, the new field $A'_\mu$ has three components (while a massless vector field has
two helicities or transverse polarizations), with one of these components corresponding
to the old $\xi$ field. To see that the $A'_\mu$ field has gobbled up the Nambu-Goldstone
boson, we note that its definition was $A'_\mu = A_\mu - (1/ev)\partial_\mu\xi$, which clearly shows
that the $\xi$ field has been incorporated into the $A'_\mu$ field.

Let us now discuss two slightly more difficult examples of the Higgs mechanism
with non-Abelian gauge fields. Then we will see that only some of the gauge
fields become massive. First, let us discuss an SU(2) gauge theory coupled to an
isovector triplet of scalar fields. We will break this down to U(1), so that only one
of the three gauge fields remains massless, while the other two acquire a mass by
eating up the Nambu-Goldstone bosons.

The action is: 

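$$ \mathcal{L} = \frac{1}{2}D_\mu\phi^i D^\mu\phi^i - V(\phi^i\phi^i) - \frac{1}{4}F^a_{\mu\nu}F^{a\mu\nu} \tag{10.32} $$

As before, we will choose $V(\phi^i\phi^i)$ such that there is a tachyon in the theory,
meaning that we have chosen the wrong vacuum. When we shift to a new vacuum,
we find that it is degenerate. We will break this degeneracy by shifting the third
component of the isovector:


$$ \langle\phi\rangle_0 = \begin{pmatrix} 0 \\ 0 \\ v \end{pmatrix}; \qquad \phi = \exp\!\left[\,i(\xi^1L^1 + \xi^2L^2)/v\,\right]\begin{pmatrix} 0 \\ 0 \\ v + \sigma \end{pmatrix} \tag{10.33} $$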


where $L^i$ are the generators of SU(2), $\xi^i$ are the Nambu-Goldstone bosons, and
we choose the third isospin generator $L^3$ to be the generator of the unbroken U(1)
symmetry.
Next, we make a local gauge transformation on the fields, such that we remove
the $\xi$ fields appearing in the previous equation for $\phi$:


$$ \begin{aligned} \phi &\to \phi' = \Omega\,\phi \\ L^aA'^a_\mu &= \Omega\,L^aA^a_\mu\,\Omega^{-1} - \frac{i}{g}\,(\partial_\mu\Omega)\,\Omega^{-1} \end{aligned} \tag{10.34} $$


where:

$$ \Omega = \exp\!\left[-i(\xi^1L^1 + \xi^2L^2)/v\right] \tag{10.35} $$


Let us expand the action around the new vacuum. The vector fields now have
the following action:


$$ -\frac{1}{4}F^a_{\mu\nu}F^{a\mu\nu} + \frac{1}{2}g^2v^2\left(A^1_\mu A^{1\mu} + A^2_\mu A^{2\mu}\right) \tag{10.36} $$

Two of the gauge fields have acquired a mass, but $A^3_\mu$, corresponding to a residual
U(1) symmetry, is still massless. Again, the number of Nambu-Goldstone bosons
equals the number of broken generators.

The other terms in the action are:


$$ \frac{1}{2}\,\partial_\mu\sigma\,\partial^\mu\sigma - V(v + \sigma) + \text{higher terms} \tag{10.37} $$


The $\xi_{1,2}$ fields, as expected, have disappeared completely by being absorbed into
the gauge fields.

Finally, we must now show that the Higgs mechanism works for an arbitrary 
gauge theory. The important feature that we want to demonstrate is that the number 
of gauge fields that become massive is equal to the number of broken generators 
of the gauge group. 

We start with a gauge group G, which has N generators, and hence N gauge
fields $A^a_\mu$. We also have the real scalar field $\phi$ transforming under some
n-dimensional representation of the group G. We start with the action:


$$ \mathcal{L} = -\frac{1}{4}F^a_{\mu\nu}F^{a\mu\nu} + \frac{1}{2}\left[(\partial_\mu - igL^aA^a_\mu)\phi\right]\cdot\left[(\partial^\mu - igL^bA^{b\mu})\phi\right] - V(\phi) \tag{10.38} $$

where $L^a$ is an n × n representation of the generators of the group G, which has N generators.

Let us choose $V(\phi)$ so that symmetry breaking occurs. Let $\phi = v$ be a
matrix equation defining the minimum of the potential. We want a vacuum that
is invariant under an M-dimensional subgroup H of G. The generators of H,
because they leave the vacuum invariant, satisfy $L^a v = 0$.
We now parametrize the scalar field as:


$$ \phi = \exp\!\left(i\sum_{i=1}^{N-M}\xi^iL^i/v\right)(v + \sigma) \tag{10.39} $$


where we sum over the N - M generators that do not correspond to the subgroup
H and do not annihilate v. For these generators $L^i v \neq 0$. Also, $\xi^i$ are the
Nambu-Goldstone bosons.

The trick is now to choose a gauge transformation which swallows up the $\xi^i$
fields appearing in the above definition. As before, we choose:


$$ \phi \to \phi' = \exp\!\left(-i\sum_{i=1}^{N-M}\xi^iL^i/v\right)\phi = \Omega\,\phi \tag{10.40} $$


The $\Omega$ in front of $\phi$ precisely cancels against the $\Omega^{-1}$ appearing in front of the
parametrization of $\phi$, so the $\xi^i$ fields disappear.

Inserting this parametrization into the action, we can collect the terms responsible
for the vector meson masses, which are given by:


$$ \frac{1}{2}A^i_\mu\,(M^2)^{ij}A^{j\mu} = \frac{1}{2}\,(gL^iv\,|\,gL^jv)\,A^i_\mu A^{j\mu} \tag{10.41} $$


where the brackets represent the matrix contraction or scalar product between two
vectors.

Thus, the masses of the gauge fields are given by the eigenvalues of the
following matrix:


$$ (M^2)^{ij} = g^2\,(v\,|\,L^iL^jv) \tag{10.42} $$


There are N - M nonzero eigenvalues of this matrix, and hence there are
N - M massive vector fields (which have absorbed the remnants of the N - M
Nambu-Goldstone bosons).
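
The mass matrix of Eq. (10.42) is easy to evaluate explicitly for the triplet example discussed earlier. The following sketch assumes the adjoint representation of SO(3) with real antisymmetric generators T^a = i L^a, so that (L^a v | L^b v) becomes the ordinary dot product of T^a v and T^b v. The helper names are hypothetical.

```python
# Gauge-boson mass matrix (M^2)^{ab} = g^2 (T^a v).(T^b v) for a triplet
# vacuum, using real antisymmetric SO(3) generators (T^a)_{jk} = eps_{ajk}.

def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def apply_generator(a, vac):
    """Return the vector T^a v."""
    return [sum(levi_civita(a, j, k) * vac[k] for k in range(3)) for j in range(3)]

def gauge_mass_matrix(vac, g):
    """3x3 matrix of squared masses for the three gauge fields."""
    rotated = [apply_generator(a, vac) for a in range(3)]
    return [[g * g * sum(x * y for x, y in zip(rotated[a], rotated[b]))
             for b in range(3)] for a in range(3)]

for row in gauge_mass_matrix([0.0, 0.0, 2.0], g=0.5):
    print(row)
```

For a vacuum along the third direction the matrix is diagonal, with two equal entries g^2 v^2 and one zero entry: two massive vector fields and one massless field for the unbroken U(1).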

In retrospect, the Higgs mechanism has a rather simple interpretation. We 
know that a gauge theory locally invariant under a group G has no mass term in 
the action for the vector field in perturbation theory. However, if the group G 
breaks down to a subgroup H via symmetry breaking, then we know that the gauge 
fields corresponding to the H subgroup must still remain massless. However, the 
gauge fields corresponding to the broken generators of G are now free to become 
massive. 

Let us now present a more formal, model-independent proof of the Nambu-Goldstone
theorem. We begin with the observation that spontaneous symmetry
breaking occurs because the vacuum is not invariant under a certain symmetry,
although the action is. In other words, beginning with a symmetry and its conserved
current $J_\mu$, we can construct its conserved charge Q such that:


$$ Q = \int d^3x\, J^0(x); \qquad Q\,|0\rangle \neq 0 \tag{10.43} $$
The essential point is that Q no longer annihilates the vacuum in a broken theory;
that is, the vacuum state $|0\rangle$ is not the true vacuum, so it is not annihilated by Q.
Current conservation means that the following commutator vanishes:


$$ \begin{aligned} 0 &= \int d^3x\,[\partial_\mu J^\mu(x), \phi(0)] \\ &= \partial_0\int d^3x\,[J^0(x), \phi(0)] + \int_S d\mathbf{S}\cdot[\mathbf{J}(x), \phi(0)] \end{aligned} \tag{10.44} $$


for some scalar boson field $\phi(0)$. Then we make the assumption that for large
enough surfaces S, we can ignore the surface term on the right-hand side of the equation.
Hence:


$$ \frac{d}{dt}\,[Q(t), \phi(0)] = 0 \tag{10.45} $$
t 
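Then:


$$ \langle 0|\,[Q(t), \phi(0)]\,|0\rangle = C \neq 0 \tag{10.46} $$


where C is a nonzero constant. Now insert a complete set of intermediate states
inside the commutator. After making a translation, we can write this expression
as:


$$ \sum_n (2\pi)^3\delta^3(\mathbf{p}_n)\Big\{ \langle 0|J^0(0)|n\rangle\langle n|\phi(0)|0\rangle\, e^{-iE_nt} - \langle 0|\phi(0)|n\rangle\langle n|J^0(0)|0\rangle\, e^{iE_nt} \Big\} = C \neq 0 \tag{10.47} $$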


In general, unless $E_n = 0$, the positive and negative frequency parts cannot
possibly cancel, and hence the contributions to the sum cannot give us a constant.
In other words, it is impossible to satisfy the previous equation unless the mass of
the intermediate states vanishes. (Furthermore, these massless states must exist,
so that the sum adds up to a nonzero value.) Thus, there must be a massless
Nambu-Goldstone state in the theory, with the property that:


$$ \langle n|\phi(0)|0\rangle \neq 0, \qquad \langle 0|J^0(0)|n\rangle \neq 0 \tag{10.48} $$


This completes the proof. 
10.3 Weak Interactions


Now that we have investigated the possibility of spontaneous symmetry breaking, 
we can apply this theory to the weak interactions, where it has enjoyed great 
success. 

To appreciate this breakthrough, we must realize that, from a historical point
of view, progress in the field theory of weak interactions was relatively
stagnant for many decades, from the original Fermi action^{7} of the 1930s, to the
overthrow of parity, to the advent of gauge theories in the 1970s.

Fermi originally tried to explain the decay of the neutron into a proton, an
electron, and an anti-neutrino:


$$ n \to p + e^- + \bar\nu \tag{10.49} $$


by postulating the phenomenological Lagrangian:

$$ \begin{aligned} \mathcal{L}_F &= \frac{G_F}{\sqrt{2}}\,[\bar\psi_p\Gamma^A\psi_n]\,[\bar\psi_e\Gamma_A\psi_\nu] + \text{h.c.} \\ &= \frac{G_F}{\sqrt{2}}\,J^A_{\rm hadron}\,J_{{\rm lepton}\,A} + \text{h.c.} \end{aligned} \tag{10.50} $$


(Experimentally, $G_F = (1.16639 \pm 0.00002) \times 10^{-5}\ \text{GeV}^{-2}$.) This action,
from the very start, was known to suffer from a series of diseases. First, the $\Gamma^A$
matrices could in principle consist of all possible combinations of the 16 Dirac
matrices. The lack of precise experimental data for years prevented a decisive
determination of the action. It took many decades finally to resolve that the correct
combination should be V - A,^{8,9} rather than scalar or tensor combinations.

The Fermi action also suffered from a fatal theoretical disease: It was
nonrenormalizable. Four-fermion interactions must be accompanied by a dimensionful
coupling constant (since the dimension of a spinor is 3/2). The Fermi constant
thus has dimension -2, and hence the theory was nonrenormalizable. Finally,
the theory violated unitarity. If we calculate the high energy behavior of the
differential cross section of any weak process, we can write (purely by dimensional
arguments):


$$ \frac{d\sigma}{d\Omega} \sim \frac{G_F^2\,k^2}{\pi^2} \tag{10.51} $$

(The differential cross section is a pure s wave because the four-fermion interaction
takes place at a single point.)
Figure 10.3. If four fermions interact via a massive vector meson, then the exchange of
the vector meson, for a large mass, mimics the original Fermi four-fermion interaction. 


On the other hand, we know from unitarity that the S matrix for s-wave
scattering must obey the law:


$$ \frac{d\sigma}{d\Omega} < \frac{1}{k^2} \tag{10.52} $$


These two results are obviously in contradiction for high energies. The discrepancy
between these two results becomes serious at around $\sqrt{s} \sim 500$ GeV.

To extend the Fermi action, physicists tried to emulate the success of the 
Yukawa theory for the strong interactions. If the Yukawa meson could mediate 
the strong interactions, then perhaps a massive vector meson could mediate the 
weak interactions. The obvious proposal was to treat the Fermi action as a 
byproduct of vector meson exchange (Fig. 10.3): 


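$$ \mathcal{L} = g_W^2\,(\bar\psi_p\gamma^\mu\psi_n)\,\frac{g_{\mu\nu} - k_\mu k_\nu/M_W^2}{k^2 - M_W^2}\,(\bar\psi_e\gamma^\nu\psi_\nu) \tag{10.53} $$


This, in turn, gave us a rough determination of the mass of the vector meson:


$$ \frac{G_F}{\sqrt{2}} \sim \frac{g_W^2}{M_W^2} \tag{10.54} $$

A rough calculation put the vector meson mass at around 50-100 GeV.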

The advantage of the vector meson approach was that it smoothed out the
singular behavior of the original Fermi action. However, the massive vector meson
theory was still nonrenormalizable. The propagator behaved as $k_\mu k_\nu/(M^2k^2)$ in the
ultraviolet region, so the higher loop graphs all diverged.

Meanwhile, the experimental confusion in the weak interactions continued
with the discovery of three identical sets of lepton pairs, corresponding to the
electron, muon, and $\tau$ particles and their corresponding neutrino partners:


$$ \begin{pmatrix} \nu_e \\ e \end{pmatrix}; \qquad \begin{pmatrix} \nu_\mu \\ \mu \end{pmatrix}; \qquad \begin{pmatrix} \nu_\tau \\ \tau \end{pmatrix} \tag{10.55} $$


To this day, no one knows why there are three identical copies of lepton
families, as well as quark families. The masses of the e, $\mu$, and $\tau$ leptons in MeV
are 0.51099906(15), 105.658389(34), and 1784. Their neutrino masses have upper
limits given by 10 eV, 0.27 MeV, and 35 MeV.


10.4 Weinberg-Salam Model 


The Weinberg-Salam model,^{10,11} one of the most successful quantum theories
besides the original QED, is a curious amalgam of the weak and electromagnetic
interactions. Strictly speaking, it is not a “unified field theory” of the weak
and electromagnetic interactions, since we must introduce two distinct coupling
constants g and g' for the SU(2) and U(1) interactions. Nonetheless, it represents
one of the most important extensions of QED in the past quarter century.

We begin by discussing the SU(2) sector. Observationally, we must incorporate
a neutral, left-handed Weyl neutrino along with a Dirac electron, which
can be considered to be the sum of left-handed and right-handed Weyl spinors.
The left-handed fermions form an isodoublet, consisting of the Weyl neutrino and
electron:


$$ L = \begin{pmatrix} \nu_e \\ e \end{pmatrix}_L \tag{10.56} $$


while the right-handed sector consists of an isosinglet, the right-handed electron:

$$ R = (e)_R \tag{10.57} $$


This curious feature, that the electron is split into two parts, with the left- and 
right-handed sectors transforming differently, is a consequence of the fact that the 
weak interactions violate parity and are mediated by V — A interactions. 

These two lepton sectors transform under SU(2) in different ways:


$$ \begin{aligned} L &\to e^{(i/2)\theta^i\sigma^i}L \\ R &\to R \end{aligned} \tag{10.58} $$


Now let us examine the transformation of these fields under U(1):


$$ \begin{aligned} L &\to e^{(i/2)\beta}L \\ R &\to e^{i\beta}R \end{aligned} \tag{10.59} $$


R and L transform slightly differently under the U(1) transformation.

The charge Q of the pair $(\nu_e, e)$ is (0, -1), which is almost equal to the
eigenvalue of the isospin operator $T^3$. In fact, the correct formula for the charge
is:


$$ Q = T^3 + \frac{Y}{2} \tag{10.60} $$


where $T^3 = \pm 1/2$ and Y = -1 for the left doublets, and $T^3 = 0$ and Y = -2
for the right singlets. Thus, we need both the SU(2) and U(1) sectors to get the
charge correctly.
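
The charge formula can be tabulated directly. The short sketch below applies $Q = T^3 + Y/2$ to the lepton assignments just quoted and to the Higgs doublet assignments ($T^3 = \pm 1/2$, Y = 1) stated later in this section; the dictionary keys are merely illustrative labels.

```python
from fractions import Fraction

def electric_charge(t3, hypercharge):
    """Q = T^3 + Y/2, kept as an exact fraction."""
    return Fraction(t3) + Fraction(hypercharge) / 2

# (T^3, Y) assignments; the key names are illustrative labels only.
ASSIGNMENTS = {
    "nu_L": (Fraction(1, 2), -1),
    "e_L": (Fraction(-1, 2), -1),
    "e_R": (0, -2),
    "phi_plus": (Fraction(1, 2), 1),
    "phi_zero": (Fraction(-1, 2), 1),
}

CHARGES = {name: electric_charge(t3, y) for name, (t3, y) in ASSIGNMENTS.items()}

for name, charge in CHARGES.items():
    print(name, charge)
```

Both chiralities of the electron come out with charge -1 even though their isospin and hypercharge differ, which is the point of combining the two sectors.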

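The final action consists of three parts. $\mathcal{L}_1$ is the gauge part; $\mathcal{L}_2$ is the
fermionic part; and $\mathcal{L}_3$ is the scalar Higgs sector:


$$ \mathcal{L} = \mathcal{L}_1 + \mathcal{L}_2 + \mathcal{L}_3 \tag{10.61} $$

where:

$$ \begin{aligned} \mathcal{L}_1 &= -\frac{1}{4}W^a_{\mu\nu}W^{a\mu\nu} - \frac{1}{4}F_{\mu\nu}F^{\mu\nu} \\ \mathcal{L}_2 &= i\bar R\gamma^\mu D_\mu R + i\bar L\gamma^\mu D_\mu L \\ \mathcal{L}_3 &= D_\mu\phi^\dagger D^\mu\phi - m^2\phi^\dagger\phi - \lambda(\phi^\dagger\phi)^2 + G_e(\bar L\phi R + \bar R\phi^\dagger L) \end{aligned} \tag{10.62} $$

where:

$$ \begin{aligned} W^a_{\mu\nu} &= \partial_\mu W^a_\nu - \partial_\nu W^a_\mu + g\,\epsilon^{abc}W^b_\mu W^c_\nu \\ F_{\mu\nu} &= \partial_\mu B_\nu - \partial_\nu B_\mu \\ D_\mu R &= (\partial_\mu + ig'B_\mu)R \\ D_\mu L &= \left[\partial_\mu + (i/2)g'B_\mu - (i/2)g\,\sigma^iW^i_\mu\right]L \\ D_\mu\phi &= \left[\partial_\mu - (i/2)g\,\sigma^iW^i_\mu - (i/2)g'B_\mu\right]\phi \end{aligned} \tag{10.63} $$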
The scalar multiplet is a complex isodoublet given by:


$$ \phi = \begin{pmatrix} \phi^+ \\ \phi^0 \end{pmatrix} \tag{10.64} $$


where the doublet has charge (1, 0), which can be given by $Q = T^3 + (1/2)Y$, such
that $T^3 = \pm 1/2$ and Y = 1.
Symmetry breaking is induced by:


$$ \langle\phi\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} 0 \\ v \end{pmatrix} \tag{10.65} $$


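After symmetry breaking, the fields $W^i_\mu$ and $B_\mu$ recombine and reemerge as
the physical photon field $A_\mu$, a neutral massive vector particle $Z_\mu$, and a charged
doublet of massive vector particles $W^\pm_\mu$:


$$ \begin{aligned} Z_\mu &= \frac{gW^3_\mu + g'B_\mu}{\sqrt{g^2 + g'^2}} = \cos\theta_W\,W^3_\mu + \sin\theta_W\,B_\mu \\ A_\mu &= \frac{-g'W^3_\mu + gB_\mu}{\sqrt{g^2 + g'^2}} = -\sin\theta_W\,W^3_\mu + \cos\theta_W\,B_\mu \\ W^\pm_\mu &= \frac{1}{\sqrt{2}}\left(W^1_\mu \pm iW^2_\mu\right) \end{aligned} \tag{10.66} $$

where the Weinberg angle $\theta_W$ is defined via:

$$ \cos\theta_W = \frac{g}{(g^2 + g'^2)^{1/2}}; \qquad \tan\theta_W = \frac{g'}{g} \tag{10.67} $$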

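By examining the mass sector, we can read off the masses of the resulting
vector particles:


$$ \begin{aligned} M_W^2 &= \frac{1}{4}g^2v^2 \\ M_Z^2 &= \frac{M_W^2}{\cos^2\theta_W} \\ M_A &= 0 \end{aligned} \tag{10.68} $$


Finally, the electric charge emerges as:


$$ e = g\sin\theta_W \tag{10.69} $$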
Experimentally, the predictions of the Weinberg-Salam model have been tested
to about one part in $10^3$ or $10^4$. At present, we have the following values for the
parameters of the Weinberg-Salam model^{12}:

$$ \begin{aligned} \sin^2\theta_W &= 0.2325 \pm 0.008 \\ M_Z &= 91.173 \pm 0.020\ \text{GeV}/c^2 \\ M_W &= 80.22 \pm 0.26\ \text{GeV}/c^2 \end{aligned} \tag{10.70} $$


The Weinberg-Salam model has been one of the outstanding successes of field 
theory, gradually rivalling the predictive power of QED. The rest of this chapter 
will be devoted to studying the many consequences of the model. 
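
As a rough consistency check of these numbers, one can combine $e = g\sin\theta_W$ and $M_Z = M_W/\cos\theta_W$ with the standard low-energy identification $G_F/\sqrt{2} = g^2/(8M_W^2)$ to estimate the vector boson masses at lowest order. The sketch below does this; the value of the fine-structure constant is an outside input not quoted in this section, and the function name is hypothetical.

```python
import math

ALPHA = 1.0 / 137.036      # low-energy fine-structure constant (assumed input)
G_FERMI = 1.16639e-5       # GeV^-2, value quoted with Eq. (10.50)
SIN2_THETA_W = 0.2325      # central value quoted in Eq. (10.70)

def tree_level_masses(alpha, g_fermi, sin2_theta_w):
    """Lowest-order W and Z masses in GeV.

    Uses e^2 = 4*pi*alpha, e = g*sin(theta_W), G_F/sqrt(2) = g^2/(8 M_W^2),
    and M_Z = M_W/cos(theta_W).
    """
    m_w = math.sqrt(math.pi * alpha / (math.sqrt(2.0) * g_fermi * sin2_theta_w))
    m_z = m_w / math.sqrt(1.0 - sin2_theta_w)
    return m_w, m_z

m_w, m_z = tree_level_masses(ALPHA, G_FERMI, SIN2_THETA_W)
print(f"M_W ~ {m_w:.1f} GeV, M_Z ~ {m_z:.1f} GeV")
```

The lowest-order estimates come out a few GeV below the measured values in Eq. (10.70); this sketch ignores radiative corrections, which are expected to account for a shift of that size.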


10.5 Lepton Decay 


Let us now use the Weinberg-Salam model to do some simple calculations, such
as the decay of the muon or the $\tau$ lepton to lowest order. Although the Born term
resembles the calculation that one might perform with the old massive vector meson
theory, the Weinberg-Salam model allows us to calculate quantum corrections
to the massive vector meson theory that we can then compare with experiment.
We are interested in purely leptonic decays, such as:


$$ \mu^- \to e^- + \bar\nu_e + \nu_\mu \tag{10.71} $$

More generally, we can have:


$$ l_a(p_1, s_a) \to \bar\nu_b(p_2) + \nu_a(p_3) + l_b(p_4, s_b) \tag{10.72} $$


where $p_1 = p_2 + p_3 + p_4$ and a and b represent lepton generations. Thus, $(l_a, \nu_a)$
and $(l_b, \nu_b)$ form two lepton generations.

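A straightforward use of the Feynman rules for the Weinberg-Salam model
yields, to lowest order:


$$ \mathcal{M} = \frac{-ig^2}{8}\left[\bar u_3\gamma^\mu(1 - \gamma_5)u_1\right]\left(\frac{-g_{\mu\nu} + q_\mu q_\nu/M_W^2}{q^2 - M_W^2}\right)\left[\bar u_4\gamma^\nu(1 - \gamma_5)v_2\right] \tag{10.73} $$


The decay rate $d\omega$ is given by:


$$ d\omega = (2\pi)^4\,\delta^4(P_i - P_f)\,\frac{|\mathcal{M}|^2\,d^3p_2\,d^3p_3\,d^3p_4}{(2\pi)^9\,2E_1\,2E_2\,2E_3\,2E_4} \tag{10.74} $$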
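We can write $|\mathcal{M}|^2$ as follows:


$$ |\mathcal{M}|^2 = \frac{g^4}{64M_W^4}\,L_{\mu\nu}M^{\mu\nu} \tag{10.75} $$


where:


$$ \begin{aligned} L_{\mu\nu} &= \text{Tr}\left[(u_3\bar u_3)\gamma_\mu(1 - \gamma_5)(u_1\bar u_1)\gamma_\nu(1 - \gamma_5)\right] \\ &= \frac{1}{2}\,\text{Tr}\left[\not{p}_3\gamma_\mu(1 - \gamma_5)(\not{p}_1 + m_a)(1 + \gamma_5\not{s}_a)\gamma_\nu(1 - \gamma_5)\right] \end{aligned} \tag{10.76} $$


where we have used the Gordon identity and have taken $M_W$ to be larger than the
momenta $q^2$ and the mass term $m_am_b$.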
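Using the standard identities, we can write:


$$ \begin{aligned} L_{\mu\nu} &= \text{Tr}\left[\not{p}_3\gamma_\mu(\not{p}_1 - m_a\not{s}_a)\gamma_\nu(1 - \gamma_5)\right] \\ &= p_3^\alpha\,(p_1 - m_as_a)^\beta\,\text{Tr}\left[\gamma_\alpha\gamma_\mu\gamma_\beta\gamma_\nu(1 - \gamma_5)\right] \end{aligned} \tag{10.77} $$


Similarly, $M^{\mu\nu}$ has almost the same structure, so it can be written:

$$ M^{\mu\nu} = (p_4 - m_bs_b)_\alpha\,p_{2\beta}\,\text{Tr}\left[\gamma^\alpha\gamma^\mu\gamma^\beta\gamma^\nu(1 - \gamma_5)\right] \tag{10.78} $$


so the final result for $|\mathcal{M}|^2$ is:

$$ |\mathcal{M}|^2 = \frac{g^4}{M_W^4}\left[p_3\cdot(p_4 - m_bs_b)\right]\left[p_2\cdot(p_1 - m_as_a)\right] \tag{10.79} $$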

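Inserting this expression back into the decay rate, we find:


$$ d\omega = \frac{g^4\,(p_4 - m_bs_b)^\mu\,(p_1 - m_as_a)^\nu}{16(2\pi)^5M_W^4\,E_1E_4}\,d^3p_4\,I_{\mu\nu} \tag{10.80} $$

where:

$$ \begin{aligned} I_{\mu\nu} &= \int\frac{d^3p_2\,d^3p_3}{E_2E_3}\,p_{3\mu}\,p_{2\nu}\,\delta^4(p - p_2 - p_3) \\ &= \frac{\pi}{6}\left(g_{\mu\nu}\,p^2 + 2p_\mu p_\nu\right) \end{aligned} \tag{10.81} $$


where $p = p_1 - p_4$.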
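To simplify this calculation, we will take the rest frame of the decaying lepton
$l_a$. Then we have the following rest frame decomposition:


$$ \begin{aligned} p_1 &= (m_a, \mathbf{0}) \\ s_a &= (0, \hat{\mathbf{s}}_a) \\ p_4 &= (E_4, \mathbf{p}_4) \\ s_b &= \left[\mathbf{p}_4\cdot\hat{\mathbf{s}}_b/m_b,\ \hat{\mathbf{s}}_b + \frac{(\mathbf{p}_4\cdot\hat{\mathbf{s}}_b)\,\mathbf{p}_4}{m_b(E_4 + m_b)}\right] \end{aligned} \tag{10.82} $$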


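Then the differential decay rate becomes:


$$ \begin{aligned} d\omega = \frac{g^4\,|\mathbf{p}_4|\,dE_4\,d\Omega}{192(2\pi)^4M_W^4\,m_a}\Bigg\{ & (m_a^2 + m_b^2 - 2m_aE_4)\left[m_a(E_4 - \mathbf{p}_4\cdot\hat{\mathbf{s}}_b) + m_a\,\hat{\mathbf{s}}_a\cdot\left(\mathbf{p}_4 - m_b\hat{\mathbf{s}}_b - \frac{(\mathbf{p}_4\cdot\hat{\mathbf{s}}_b)\,\mathbf{p}_4}{E_4 + m_b}\right)\right] \\ & + 2m_a\left(m_a - E_4 - \mathbf{p}_4\cdot\hat{\mathbf{s}}_a\right)\left[(m_a - E_4)(E_4 - \mathbf{p}_4\cdot\hat{\mathbf{s}}_b) + \mathbf{p}_4\cdot\left(\mathbf{p}_4 - m_b\hat{\mathbf{s}}_b - \frac{(\mathbf{p}_4\cdot\hat{\mathbf{s}}_b)\,\mathbf{p}_4}{E_4 + m_b}\right)\right]\Bigg\} \end{aligned} \tag{10.83} $$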
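This formidable expression can be simplified if we let $m_b \approx 0$. Let $\mathbf{n}$ be a unit
vector pointing along $\mathbf{p}_4$, such that $\cos\theta = \hat{\mathbf{s}}_a\cdot\mathbf{n}$. Let $x = E_4/E_4^{\max}$, where $E_4^{\max}$
is the maximum allowed value of $E_4$, or $m_a/2$. Then we have:


$$ d\omega = \frac{g^4m_a^5}{32M_W^4\,(192\pi^3)}\,n(x)\,[1 + \alpha(x)\cos\theta]\,[1 - \mathbf{n}\cdot\hat{\mathbf{s}}_b]\,\frac{dx\,d\cos\theta\,d\varphi}{8\pi} \tag{10.84} $$

where:

$$ n(x) = 2x^2(3 - 2x); \qquad \alpha(x) = (1 - 2x)/(3 - 2x) \tag{10.85} $$


Now sum over the spin states of the bth particle, average over the spin states $s_a$,
and integrate over $d\Omega$. We then find:


$$ \frac{d\omega}{dx} = \frac{g^4m_a^5\,n(x)}{32M_W^4\,192\pi^3} \tag{10.86} $$

Integrating over x, we find:

$$ \omega = \frac{g^4m_a^5}{32M_W^4\,(192)\pi^3} = \frac{G_F^2\,m_a^5}{192\pi^3} \tag{10.87} $$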
We performed this calculation for small $m_b$. However, the calculation can also
be performed by adding in the corrections for small $\epsilon = m_b/m_a$. Power expanding
in this variable, we find:


$$ \Gamma = \frac{G_F^2\,m_a^5}{192\pi^3}\left(1 - 8\epsilon^2 - 24\epsilon^4\ln\epsilon + 8\epsilon^6 - \epsilon^8 + \cdots\right) \tag{10.88} $$
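
To get a feeling for the size of this result, the sketch below evaluates the lowest-order muon lifetime $\tau = \hbar/\Gamma$ from $\Gamma = G_F^2 m_\mu^5/(192\pi^3)$ times the finite-mass factor $1 - 8\epsilon^2 - 24\epsilon^4\ln\epsilon + 8\epsilon^6 - \epsilon^8$. The Fermi constant and lepton masses are the values quoted in this chapter; the value of $\hbar$ in GeV s is an outside input, and the function names are hypothetical.

```python
import math

HBAR_GEV_S = 6.582119569e-25   # hbar in GeV*s (assumed outside input)
G_FERMI = 1.16639e-5           # GeV^-2, quoted in this chapter
M_MU = 0.105658389             # GeV, from the quoted 105.658389 MeV
M_E = 0.51099906e-3            # GeV, from the quoted 0.51099906 MeV

def mass_correction(eps):
    """Finite final-lepton-mass factor, eps = m_b / m_a."""
    return (1.0 - 8.0 * eps**2 - 24.0 * eps**4 * math.log(eps)
            + 8.0 * eps**6 - eps**8)

def lowest_order_width(g_fermi, m_a, m_b):
    """Decay width in GeV, without radiative corrections."""
    return g_fermi**2 * m_a**5 / (192.0 * math.pi**3) * mass_correction(m_b / m_a)

def lifetime_seconds(width_gev):
    return HBAR_GEV_S / width_gev

tau = lifetime_seconds(lowest_order_width(G_FERMI, M_MU, M_E))
print(f"lowest-order muon lifetime ~ {tau:.3e} s")
```

The result is close to 2.19 microseconds. The electron-mass factor changes it only at the level of a few parts in ten thousand, and radiative corrections are not included in this sketch.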
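We should point out that this result can be generalized to allow for couplings
other than V and A, which allows for a test of the accuracy of the electroweak
theory. We could start with the transition amplitude:


$$ \mathcal{M} = \sum_i\,(\bar u_4\Gamma_iu_1)\left[\bar u_3\Gamma_i(g_i + g'_i\gamma_5)v_2\right] \tag{10.89} $$


where we include all possible 16 Dirac matrices in the transition element, not just
V - A.

For $m_b = 0$, the calculation of the decay rate is long but very straightforward,
and yields^{13}:


$$ d\omega = \frac{A}{4}\,\frac{m_a^5}{192\pi^3}\,x^2\,dx\,\frac{d\Omega}{4\pi}\left\{6(1 - x) + 4\rho\left(\frac{4}{3}x - 1\right) - \xi\cos\theta\left[2(1 - x) + 4\delta\left(\frac{4}{3}x - 1\right)\right]\right\} \tag{10.90} $$

where:

$$ \begin{aligned} a &= |g_S|^2 + |g'_S|^2 + |g_P|^2 + |g'_P|^2 \\ a' &= 2\,\text{Re}\,(g_Sg'^*_P + g'_Sg^*_P) \\ b &= |g_V|^2 + |g'_V|^2 + |g_A|^2 + |g'_A|^2 \\ b' &= -2\,\text{Re}\,(g_Vg'^*_A + g'_Vg^*_A) \\ c &= |g_T|^2 + |g'_T|^2 \\ c' &= 2\,\text{Re}\,(g_Tg'^*_T) \end{aligned} \tag{10.91} $$

The Michel parameters are defined as:

$$ \begin{aligned} A\rho &= 3b + 6c \\ A\xi &= -3a' - 4b' + 14c' \\ \delta &= (3b' - 6c')/(3a' + 4b' - 14c') \end{aligned} \tag{10.92} $$


where $A = a + 4b + 6c$.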
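
The definitions of the Michel parameters are simple enough to evaluate by hand, but a small helper makes it easy to explore different coupling patterns. The sketch below implements the three ratios exactly as defined, with $A = a + 4b + 6c$. The test case, in which only b and b' are nonzero with b' = -b, is an illustrative choice intended to mimic a pure vector and axial-vector interaction; the function name is hypothetical.

```python
from fractions import Fraction

def michel_parameters(a, a_p, b, b_p, c, c_p):
    """Return (rho, xi, delta) from the six coupling combinations.

    rho   = (3b + 6c) / A
    xi    = (-3a' - 4b' + 14c') / A
    delta = (3b' - 6c') / (3a' + 4b' - 14c')
    with A = a + 4b + 6c. Raises ZeroDivisionError if a denominator vanishes.
    """
    norm = Fraction(a + 4 * b + 6 * c)
    rho = Fraction(3 * b + 6 * c) / norm
    xi = Fraction(-3 * a_p - 4 * b_p + 14 * c_p) / norm
    delta = Fraction(3 * b_p - 6 * c_p) / Fraction(3 * a_p + 4 * b_p - 14 * c_p)
    return rho, xi, delta

# Illustrative choice: only vector/axial combinations present, with b' = -b.
print(michel_parameters(a=0, a_p=0, b=4, b_p=-4, c=0, c_p=0))
```

For this choice the helper returns rho = 3/4, xi = 1, and delta = 3/4. Any admixture of scalar or tensor couplings would move these values, which is why the parameters serve as a test of the coupling structure.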
When radiative corrections are included in the calculation, we find, for muon
decay:^{14}


$$ \Gamma = \frac{G_F^2\,m_\mu^5}{192\pi^3}\left[1 - \frac{8m_e^2}{m_\mu^2}\right]\left[1 + \frac{3}{5}\frac{m_\mu^2}{M_W^2} + \frac{\alpha}{2\pi}\left(\frac{25}{4} - \pi^2\right)\right] \tag{10.93} $$


These and other theoretical calculations have shown good agreement with
experiment.


10.6 $R_\xi$ Gauge


Now that we have described the Weinberg-Salam model, we will study, in this and
the next section, how to quantize it. We know that massive gauge theories are not
renormalizable by a simple analysis of the ultraviolet behavior of their propagators.
[The propagators behave as O(1) as $k_\mu$ becomes large, which spoils renormalizability.]
Unlike the situation in QED, we cannot appeal to the Ward identities to eliminate
the troublesome term in the propagator: $k_\mu k_\nu/[M^2(k^2 - m^2)]$.

This makes us wonder how spontaneously broken gauge theories can preserve
both unitarity and renormalizability, which seem totally contradictory. On the
surface, it seems impossible to preserve both features, which was one reason why
massive gauge theories were rejected as a model for the weak interactions.

To see how spontaneously broken gauge theories can be both unitary and
renormalizable at the same time, we will use the $R_\xi$ gauge,^{15} which has the
advantage that it interpolates between two sets of propagators. We will then
specialize this to the ’t Hooft gauge when we consider the case of the
Weinberg-Salam model.

To obtain the $R_\xi$ gauge, we will insert a new term into the action. Let us
impose the gauge:


$$ F(A_\mu) = a(x) \tag{10.94} $$


on our theory, where a(x) is an arbitrary, real field. We can insert the following
term into the path integral:


$$ \int Da\;\exp\left(-\frac{i}{2}\int d^4x\;a^2\right) \tag{10.95} $$

The path integral, with the new gauge constraint, now becomes:


$$ \int Da\,DA_\mu\,\Delta_{FP}\;\delta\big(F(A_\mu) - a\big)\,\exp\left[i\int d^4x\left(-\frac{1}{2}a^2 + \mathcal{L}\right)\right] \tag{10.96} $$


By performing the path integral over a(x), we find that the Lagrangian is altered
by:


$$ \mathcal{L} \to \mathcal{L} - \frac{1}{2}F^2 \tag{10.97} $$

In particular, we will insert the gauge fixing term:

$$ F(A_\mu) = \sqrt{\xi}\,(\partial^\mu A_\mu) \tag{10.98} $$


With this new gauge-fixing term, the action becomes (with $\alpha = 1/\xi$):


$$ \begin{aligned} \mathcal{L} &= -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} - \frac{1}{2\alpha}(\partial_\mu A^\mu)^2 \\ &= \frac{1}{2}A^\mu\left[g_{\mu\nu}\partial^2 + (\alpha^{-1} - 1)\,\partial_\mu\partial_\nu\right]A^\nu \end{aligned} \tag{10.99} $$


Inverting this expression and solving for the propagator, we find:


$$ D_{\mu\nu} = -\frac{1}{k^2}\left(g_{\mu\nu} - (1 - \alpha)\frac{k_\mu k_\nu}{k^2}\right) \tag{10.100} $$


Although the Green’s functions for the theory are all gauge dependent (i.e.,
dependent on the parameter $\alpha = 1/\xi$), we will find that the S matrix elements
are all independent of $\alpha$, which is now seen to be an unphysical artifact of the
gauge-fixing procedure.

For various values of $\alpha$, we have various gauges [see Eq. (4.44)]:


$$ \begin{aligned} \alpha = 1: &\quad \text{Feynman gauge} \\ \alpha = 0: &\quad \text{Landau gauge} \end{aligned} \tag{10.101} $$


Now let us discuss the massive case, which describes spontaneously broken
gauge theories. When the gauge meson develops a mass, the action becomes:


$$ \mathcal{L} = \frac{1}{2}A_\mu\left[g^{\mu\nu}\partial^2 + \partial^\mu\partial^\nu(\alpha^{-1} - 1) + g^{\mu\nu}m^2\right]A_\nu \tag{10.102} $$


To find the Green’s function, we need to solve:


$$ \left[(\partial^2 + m^2)\,g^{\mu\nu} - \partial^\mu\partial^\nu(1 - \alpha^{-1})\right]\Delta(x - y; \alpha)_{\nu\rho} = -\delta^\mu_\rho\,\delta^4(x - y) \tag{10.103} $$

Inverting this expression, we find that the propagator is given by:


$$ \Delta(k, \alpha)_{\mu\nu} = -\frac{1}{k^2 - m^2}\left(g_{\mu\nu} - (1 - \alpha)\frac{k_\mu k_\nu}{k^2 - \alpha m^2}\right) \tag{10.104} $$


(The propagator has a pole at $k^2 = \alpha m^2$, which represents a fictitious particle.
This pole is cancelled in the S matrix by other contributions, which preserves
unitarity.)

In the limit $\alpha \to 0$, we find:


$$ \lim_{\alpha\to 0}\Delta(k, \alpha)_{\mu\nu} = -\frac{g_{\mu\nu} - k_\mu k_\nu/k^2}{k^2 - m^2} \tag{10.105} $$

In the ultraviolet limit, this propagator is much better behaved than the usual
massive vector propagator; it goes as $O(1/k^2)$, which gives us good power counting
behavior in the Feynman graphs. The price we pay for such a propagator, however,
is that the theory is not manifestly unitary. If we take the diagonal elements of the
propagator, they alternate in sign, an indication that there are longitudinal ghosts.

In the limit $\alpha \to \infty$, however, we have the propagator:


$$ \lim_{\alpha\to\infty}\Delta(k, \alpha)_{\mu\nu} = -\frac{g_{\mu\nu} - k_\mu k_\nu/m^2}{k^2 - m^2} \tag{10.106} $$

This propagator, by contrast, has very bad convergence properties. For large k, it
behaves like a constant, which is disastrous from a power counting point of view.
The advantage of this limit, however, from the S matrix point of view, is that it is
unitary. The 0, 0 component of the propagator, taken in the rest frame, vanishes.

We now have the strange situation where for $\alpha = 0$, the theory appears
renormalizable but not manifestly unitary, but for $\alpha = \infty$, the theory appears
unitary, but not manifestly renormalizable.

In summary:


$$ \begin{aligned} \alpha \to 0: &\quad \text{Renormalizable, not manifestly unitary} \\ \alpha \to \infty: &\quad \text{Unitary, not manifestly renormalizable} \end{aligned} \tag{10.107} $$
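
The two behaviors summarized above can be seen numerically. The sketch below builds the momentum-space kinetic operator and the massive propagator for a generic gauge parameter, checks that one inverts the other, and compares how the components scale when the momentum is made large. It is an illustration under stated assumptions: the overall sign conventions of `kinetic_operator` and `propagator` are chosen here so that their product is the identity, and all function names are hypothetical.

```python
METRIC = [1.0, -1.0, -1.0, -1.0]   # diagonal of g_{mu nu} (same for g^{mu nu})

def lower(k_up):
    return [METRIC[i] * k_up[i] for i in range(4)]

def k_squared(k_up):
    return sum(METRIC[i] * k_up[i] ** 2 for i in range(4))

def kinetic_operator(k_up, mass, alpha):
    """K^{mu nu} = -(k^2 - m^2) g^{mu nu} + (1 - 1/alpha) k^mu k^nu."""
    k2 = k_squared(k_up)
    return [[-(k2 - mass**2) * (METRIC[m] if m == n else 0.0)
             + (1.0 - 1.0 / alpha) * k_up[m] * k_up[n]
             for n in range(4)] for m in range(4)]

def propagator(k_up, mass, alpha):
    """D_{mu nu} = -[g_{mu nu} - (1-alpha) k_mu k_nu/(k^2 - alpha m^2)]/(k^2 - m^2)."""
    k2, k_dn = k_squared(k_up), lower(k_up)
    return [[-((METRIC[m] if m == n else 0.0)
               - (1.0 - alpha) * k_dn[m] * k_dn[n] / (k2 - alpha * mass**2))
             / (k2 - mass**2) for n in range(4)] for m in range(4)]

def unitary_propagator(k_up, mass):
    """alpha -> infinity limit: -[g_{mu nu} - k_mu k_nu/m^2]/(k^2 - m^2)."""
    k2, k_dn = k_squared(k_up), lower(k_up)
    return [[-((METRIC[m] if m == n else 0.0) - k_dn[m] * k_dn[n] / mass**2)
             / (k2 - mass**2) for n in range(4)] for m in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def max_abs(matrix):
    return max(abs(x) for row in matrix for x in row)

k = [3.0, 1.0, 0.5, -2.0]
big = [10.0 * c for c in k]
bigger = [100.0 * c for c in k]
print(max_abs(propagator(big, 1.2, 0.7)) / max_abs(propagator(bigger, 1.2, 0.7)))
print(max_abs(unitary_propagator(big, 1.2)) / max_abs(unitary_propagator(bigger, 1.2)))
```

Scaling the momentum by a factor of ten reduces the finite-$\alpha$ propagator by roughly a factor of one hundred, as expected for $1/k^2$ falloff, while the unitary-gauge propagator hardly changes, which is the constant large-k behavior that spoils power counting.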


Although the Green’s functions are $\xi$ dependent, we know that the S matrix
must be independent of $\xi$, which is a gauge artifact; therefore the theory is both
unitary and renormalizable. Although this argument is not totally rigorous, the
$R_\xi$ gauge, because it smoothly interpolates between two gauges, intuitively shows
how unitarity and renormalizability are complementary, not contradictory. To
see how to apply this to the Weinberg-Salam model, we now define the ’t Hooft
gauge.
10.7 ’t Hooft Gauge


Among the most convenient gauges in which to quantize the Weinberg-Salam model are
the ’t Hooft gauges,^{16} which reveal the close link between unitarity and
renormalizability. We begin by analyzing an O(3) gauge theory coupled to a triplet of Higgs
bosons:


$$ \mathcal{L} = -\frac{1}{4}F^a_{\mu\nu}F^{a\mu\nu} + \frac{1}{2}\left(\partial_\mu\phi^i + g\,\epsilon^{ijk}A^j_\mu\phi^k\right)^2 - V(\phi^i\phi^i) \tag{10.108} $$


where $(\tau^i)_{jk} = -i\epsilon_{ijk}$ form the adjoint representation of O(3).
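As before, we will parametrize the Higgs bosons via:


$$ \phi = \exp\left[-\frac{i}{v}\left(\xi_1\tau^1 + \xi_2\tau^2\right)\right]\begin{pmatrix} 0 \\ 0 \\ v + \eta \end{pmatrix} = \begin{pmatrix} \xi_2 \\ -\xi_1 \\ v + \eta \end{pmatrix} + \text{higher terms} \tag{10.109} $$


We are replacing the original triplet of Higgs bosons $\phi^1, \phi^2, \phi^3$ with the set
$(\xi_1, \xi_2, \eta)$.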


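Now let us substitute this new parametrization of the Higgs sector into the
original action. After a bit of algebra, we find that the action can be written as
the sum of four pieces. The first piece is the usual gauge action plus scalar fields,
with a massive $\eta$ field:


$$ \mathcal{L}_1 = -\frac{1}{4}\left(F^a_{\mu\nu}\right)^2 + \frac{1}{2}\left[(\partial_\mu\xi_1)^2 + (\partial_\mu\xi_2)^2 + (\partial_\mu\eta)^2\right] - \frac{1}{2}M_\eta^2\,\eta^2 \tag{10.110} $$


The second term involves a cross-term between the $A_\mu$ and $\xi$ fields that we want
to eliminate:


$$ \mathcal{L}_2 = M\left(A^1_\mu\,\partial^\mu\xi_1 + A^2_\mu\,\partial^\mu\xi_2\right) \tag{10.111} $$


The third term consists of the interactions between the gauge field and the scalar
fields:


$$ \mathcal{L}_3 = \left(\frac{1}{2}M^2 + gM\eta + \frac{1}{2}g^2\eta^2\right)\left(A^1_\mu A^{1\mu} + A^2_\mu A^{2\mu}\right) + g\eta\left(A^1_\mu\,\partial^\mu\xi_1 + A^2_\mu\,\partial^\mu\xi_2\right) \tag{10.112} $$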
The last term contains the scalar self-interactions:


1 3M2 3 
B= 242, 4. =e 4 ey 
x gus + el ae 


2) 2 
Mion ea a Msg (10.113) 


2v2 4v 
where $\xi^2 = \xi_1^2 + \xi_2^2$.

Our goal is to choose a gauge in which the second term $\mathcal{L}_2$ with the
gauge-scalar coupling vanishes. To kill this term, we choose the gauge:


$$ \mathcal{L}_{GF} = -\frac{1}{2\alpha}\left(\partial_\mu A^{a\mu} - \alpha M\xi_a\right)^2 \tag{10.114} $$


where $M = gv$. The cross term coming from the gauge fixing kills $\mathcal{L}_2$ after a
partial integration.

Now let us write down the propagators for the various fields (disregarding the
ghosts):


$$ \begin{aligned} A^{1,2}_\mu: &\quad -i\,\frac{g_{\mu\nu} - (1 - \alpha)\,k_\mu k_\nu/(k^2 - \alpha M^2)}{k^2 - M^2 + i\epsilon} \\ \xi_{1,2}: &\quad \frac{i}{k^2 - \alpha M^2} \\ \eta: &\quad \frac{i}{k^2 - M_\eta^2} \end{aligned} \tag{10.115} $$


In the limit $\alpha \to \infty$, we have a unitary theory, but one in which manifest
renormalizability is lost. For this choice, called the unitary gauge, the spurious
pole for the $\xi_{1,2}$ field at $\alpha M^2$ disappears from the theory, and we are only left with
the physical fields propagating in the theory.

In the limit $\alpha \to 0$, the theory is renormalizable by power counting, but not
manifestly unitary. For intermediate values of $\alpha$, the poles in the propagator of
the gauge fields $A^{1,2}_\mu$ at $k^2 = \alpha M^2$ cancel with the poles in the propagators of the
$\xi_{1,2}$ field. Thus, it is no contradiction to have a theory which is both unitary and
renormalizable.

The point of this exercise is to quantize the Weinberg-Salam model. The
Higgs sector is given by:


$$ \left|\left(\partial_\mu - \frac{i}{2}g'B_\mu - \frac{i}{2}g\,\sigma^iW^i_\mu\right)\phi\right|^2 - m^2\phi^\dagger\phi - \lambda(\phi^\dagger\phi)^2 \tag{10.116} $$


We will choose a parametrization that exchanges the complex doublet of
Higgs mesons $\phi_1, \phi_2$, which contains four separate fields, with the four real fields
$\xi_1, \xi_2, \xi_3, \eta$:


+= d-4)(2,) = 


Ue papers oe (10.117) 
= —_ Uv soi . 
v2 1 —i&/v : 


Expanding out the action generates a large number of terms. However, the
term in which we are interested is the cross term between the gauge and scalar
part, which is given by:


$$ -M_BB_\mu\,\partial^\mu\xi_3 + M_WW^i_\mu\,\partial^\mu\xi_i + \frac{1}{2}\,\partial_\mu\xi_i\,\partial^\mu\xi_i \tag{10.118} $$


where we sum over i = 1, 2 and $M_B = g'v/2$ and $M_W = gv/2$.
We now choose the ’t Hooft gauge so that this cross term is cancelled:


$$ \mathcal{L}_{GF} = -\frac{1}{2\alpha}\left(\partial_\mu W^{a\mu} - \alpha M_W\xi_a\right)^2 - \frac{1}{2\alpha}\left(\partial_\mu B^\mu + \alpha M_B\xi_3\right)^2 \tag{10.119} $$


where we sum over a = 1, 2.


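The cross terms cancel, and we are left with the mass terms:


$$ -\frac{1}{2}\alpha M_W^2\left(\xi_1^2 + \xi_2^2\right) - \frac{1}{2}\alpha M_Z^2\,\xi_3^2 \tag{10.120} $$


This can be rewritten in terms of the physical fields:


$$ \begin{aligned} W^3_\mu &= -\sin\theta_W\,A_\mu + \cos\theta_W\,Z_\mu \\ B_\mu &= \cos\theta_W\,A_\mu + \sin\theta_W\,Z_\mu \\ W^\pm_\mu &= \frac{1}{\sqrt{2}}\left(W^1_\mu \pm iW^2_\mu\right) \end{aligned} \tag{10.121} $$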


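Now we can write down the new action, and from it extract the Feynman rules
for the propagator. The relevant terms are:


$$ \begin{aligned} \mathcal{L} = {} & -\frac{1}{4}\left(\partial_\mu W^a_\nu - \partial_\nu W^a_\mu\right)^2 - \frac{1}{4}\left(\partial_\mu B_\nu - \partial_\nu B_\mu\right)^2 \\ & - \frac{1}{2\alpha}\left(\partial^\mu W^a_\mu - \alpha M_W\xi_a\right)^2 - \frac{1}{2\alpha}\left(\partial_\mu A^\mu\right)^2 \\ & - \frac{1}{2\alpha}\left(\partial_\mu Z^\mu + \alpha M_Z\xi_3\right)^2 \\ & + M_W^2\,W^+_\mu W^{-\mu} + \frac{1}{2}M_Z^2\,Z_\mu Z^\mu + \cdots \end{aligned} \tag{10.122} $$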


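The propagators can now be read off from the action:


$$ \begin{aligned} W^\pm_\mu: &\quad -i\,\frac{g_{\mu\nu} - (1 - \alpha)\,k_\mu k_\nu/(k^2 - \alpha M_W^2)}{k^2 - M_W^2 + i\epsilon} \\ A_\mu: &\quad \frac{-i\,g_{\mu\nu}}{k^2 + i\epsilon} \\ Z_\mu: &\quad -i\,\frac{g_{\mu\nu} - (1 - \alpha)\,k_\mu k_\nu/(k^2 - \alpha M_Z^2)}{k^2 - M_Z^2 + i\epsilon} \\ \xi_{1,2}: &\quad \frac{i}{k^2 - \alpha M_W^2} \\ \xi_3: &\quad \frac{i}{k^2 - \alpha M_Z^2} \end{aligned} \tag{10.123} $$


Again, we have a theory in which unitarity is manifest for $\alpha \to \infty$ but
manifest renormalizability is lost. The fictitious poles vanish, and we are left with
a theory defined only in terms of the physical fields. For $\alpha \to 0$, the theory is
renormalizable, but not manifestly unitary. In general, the fictitious poles coming
from the propagators of the $\xi_{1,2,3}$ fields cancel against the poles coming from the
propagators of the $Z_\mu$ and $W^\pm_\mu$ fields, giving us a theory that is both unitary and
renormalizable.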


10.8 Coleman—Weinberg Mechanism 


Although spontaneous symmetry breaking lies at the heart of the Weinberg—Salam 
model, one of its weaknesses is the arbitrariness of the Higgs potential. This is a 
serious criticism, since many of the physical parameters depend crucially on the
precise form of the Higgs potential. 

In principle, one would like to derive the Higgs potential from more funda- 
mental principles, with as few arbitrary parameters as possible. 

One interesting approach is the Coleman–Weinberg method,¹⁷ where the Higgs
potential is induced by radiative corrections, rather than being inserted by hand. 
In this approach, we sum over higher-loop graphs to induce an effective potential, 
which may then produce spontaneous symmetry breaking. 

Ideally, one would like to start with a theory that is massless from the very 
beginning and then induce the mass corrections appearing in the action by radiative 
corrections. This is called dimensional transmutation, where a dimensionless


Figure 10.4. The sum over one loop graphs with an arbitrary number of ¢ external lines 
generates an effective potential, which in turn can induce, under certain circumstances, 
spontaneous symmetry breaking. 


theory trades one of its dimensionless coupling constants for a dimensionful one. 
This, in turn, gives us the hope of deriving all masses from first principles. 

To illustrate this procedure, let us first study the simplest possible example,
the $\phi^4$ theory. Although this theory is too simple to give us a reliable mechanism
to break gauge symmetries, this example reveals the basic principles.

Let us begin with the usual $\phi^4$ theory, with the action:


$$
\mathcal{L} = \frac{1}{2}(\partial_\mu\phi)^2 - \frac{1}{2}m^2\phi^2 - \frac{\lambda}{4!}\phi^4 \tag{10.124}
$$


(We will eventually take the limit as $m \to 0$ at the end of the calculation.) Our
task is to sum over an infinite series of one-loop graphs with an arbitrary number
of $\phi^4$ vertices attached to it (Fig. 10.4).

After this sum is performed, the net effect of this series is to generate a 
new, effective action where the potential is nonpolynomial. For example, let 
us use Feynman’s rules to give us the contribution to the single-loop potential. 
Each single-loop diagram is given by an integral over the internal momentum. 
Feynman’s rules give us the contribution to the single-loop graph with n vertex 
insertions as: 


$$
i\,\frac{(2n)!}{2^n\,2n}\int\frac{d^4k}{(2\pi)^4}\left((-i\lambda)\,\frac{i}{k^2 - m^2 + i\epsilon}\right)^n \tag{10.125}
$$

where the symmetry factor must be inserted because there are (2n)! ways to
distribute 2n particles among the external lines. (We have taken the external lines
to have zero momentum. This will be justified later.)

Now we would like to write down an effective action that generates this series 
at the tree level. To obtain this new, effective potential, we simply multiply the
term with n insertions by $\phi^{2n}$, where $\phi$ from now on will represent the classical
value of the field. We must also correct for symmetry factors and then sum. The
effective potential now contains the term:


$$
i\int\frac{d^4k}{(2\pi)^4}\sum_{n=1}^{\infty}\frac{1}{2n}\left(\frac{\lambda\phi^2/2}{k^2 - m^2 + i\epsilon}\right)^n \tag{10.126}
$$


This series, fortunately, can be easily summed by using the Taylor expansion for
the function log(1 + x). Let us sum the series (which yields a divergent integral)
and then perform the integration by putting in an explicit cutoff $\Lambda$:


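$$
\begin{aligned}
V_{\rm eff} &= \frac{1}{2}m^2\phi^2 + \frac{\lambda}{4!}\phi^4 + \frac{1}{2}\int\frac{d^4k}{(2\pi)^4}\log\left[1 + \frac{\lambda\phi^2/2}{k^2 + m^2}\right] \\
&= \frac{1}{2}m^2\phi^2 + \frac{\lambda}{4!}\phi^4 + \frac{\lambda\Lambda^2}{64\pi^2}\phi^2
 + \frac{1}{64\pi^2}\left(m^2 + \frac{1}{2}\lambda\phi^2\right)^2\left[\log\left(\frac{m^2 + \lambda\phi^2/2}{\Lambda^2}\right) - \frac{1}{2}\right]
\end{aligned} \tag{10.127}
$$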


where we have used the summation: 


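$$
\frac{1}{k^2 - m^2}\sum_{n=0}^{\infty}\left(\frac{\lambda\phi^2/2}{k^2 - m^2}\right)^n
= \frac{1}{k^2 - m^2}\,\frac{1}{1 - \dfrac{\lambda\phi^2/2}{k^2 - m^2}}
= \frac{1}{k^2 - (m^2 + \lambda\phi^2/2)} \tag{10.128}
$$
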
Since the original theory was renormalizable, this means that, with the addition 
of counterterms into the action, we can absorb all cutoff-dependent infinities into 
a renormalization of the parameters of the theory. 
We add the following counterterms to the action: 


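$$
V_{\rm eff} \to V_{\rm eff} + \frac{A}{2}\phi^2 + \frac{B}{4!}\phi^4 \tag{10.129}
$$


and then solve for parameters A, B by making the following definitions of the
renormalized parameters:


$$
m^2 = \frac{d^2V_{\rm eff}}{d\phi^2}\bigg|_{\phi=0}, \qquad
\lambda = \frac{d^4V_{\rm eff}}{d\phi^4}\bigg|_{\phi=M} \tag{10.130}
$$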


(We have taken the condition that the classical field $\phi = M$ in order to avoid
infrared divergences as we take the limit $m \to 0$.)

Now let us insert this effective potential (with the counterterms added) and
then use these conditions to determine A and B. For small m, the calculation is
straightforward and yields:


$$
\begin{aligned}
V_{\rm eff,r} ={}& \frac{1}{2}m^2\phi^2 + \frac{\lambda}{4!}\phi^4
 + \frac{1}{64\pi^2}\bigg[\left(m^2 + \frac{1}{2}\lambda\phi^2\right)^2\log\left(\frac{m^2 + \lambda\phi^2/2}{m^2}\right) \\
& - \frac{1}{2}\lambda m^2\phi^2 - \frac{25}{24}\lambda^2\phi^4 + \frac{1}{4}\lambda^2\phi^4\log\frac{2m^2}{\lambda M^2}\bigg]
\end{aligned} \tag{10.131}
$$


We now perform our last step, by taking the limit as $m \to 0$:


$$
V_{\rm eff,r} = \frac{\lambda}{4!}\phi^4 + \frac{\lambda^2\phi^4}{256\pi^2}\left(\log\frac{\phi^2}{M^2} - \frac{25}{6}\right) \tag{10.132}
$$


This is our final result. On the surface, it appears that we have accomplished 
our goal: We have traded a potential (with no mass terms) with a new potential 
that has a new minimum away from the origin (where a new mass scale has been 
introduced by radiative corrections). 

However, this example has been too simple. The new minimum lies too far 
from the origin, beyond the reliability of the one-loop potential: 


$$
\lambda\log\frac{\langle\phi\rangle^2}{M^2} = -\frac{32}{3}\pi^2 + O(\lambda) \tag{10.133}
$$

The term on the left is greater than one in magnitude, so that the loop contribution
is larger than that of the tree contribution; so we are outside the region where the
single-loop approximation is reliable. We have gained some insight into the use of
radiative corrections to drive the minimum of the potential away from the origin, but
our example has been too simple, with only one coupling constant.
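
The size of the loop term at the stationary point can be checked numerically. The following is a minimal sketch, assuming the one-loop potential has the form of Eq. (10.132) with the scale $M$ set to 1; the function names are hypothetical and chosen only for this illustration.

```python
import math


def dv_dphi(phi, lam, M=1.0):
    """Analytic derivative of the potential in Eq. (10.132) with respect to phi."""
    log_term = math.log(phi**2 / M**2)
    bracket = lam / 6.0 + lam**2 / (256.0 * math.pi**2) * (4.0 * log_term - 44.0 / 3.0)
    return phi**3 * bracket


def lambda_log_at_minimum(lam):
    """Value of lam * log(<phi>^2 / M^2) where the bracket in dv_dphi vanishes."""
    return -32.0 * math.pi**2 / 3.0 + 11.0 * lam / 3.0


def loop_to_tree_ratio(lam):
    """Magnitude of (one-loop term) / (tree term) at the stationary point."""
    log_term = lambda_log_at_minimum(lam) / lam
    loop = lam**2 / (256.0 * math.pi**2) * (log_term - 25.0 / 6.0)
    tree = lam / 24.0
    return abs(loop / tree)
```

Under these assumptions the ratio works out to $1 + 3\lambda/64\pi^2$, so the loop term is never smaller than the tree term at the would-be minimum, whatever the value of $\lambda$.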

Next, we couple charged scalars to QED, where we now have enough coupling
constants to make the Coleman–Weinberg mechanism work. We start with a new
action, with two coupling constants $e$ and $\lambda$:


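$$
\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \frac{1}{2}\left|(\partial_\mu - ieA_\mu)\phi\right|^2 - \frac{\lambda}{4!}(\phi^*\phi)^2 \tag{10.134}
$$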


We choose the Landau gauge, where the propagator becomes: 


$$
i\Delta_{\mu\nu}(k) = -i\,\frac{g_{\mu\nu} - k_\mu k_\nu/k^2}{k^2 + i\epsilon} \tag{10.135}
$$


Figure 10.5. These are the only diagrams which contribute to the effective potential in the
problem.


We choose this gauge because $k^\mu\Delta_{\mu\nu} = 0$. In this gauge, the only diagrams that
contribute to the effective potential are given in Figure 10.5.

The graph in Figure 10.6 does not contribute to the action because $k_\mu$ does not
couple to the propagator in the Landau gauge. There are thus only three types of
diagrams that have to be computed.

We perform the calculation in the same way as before: 


1. We sum over each set of diagrams separately in the one-loop approximation 
by using the power expansion for log(1 +x). 


2. We use a cutoff to render the integrals finite. 

3. We introduce new parameters into the theory via counterterms. 

4. We calculate the value of these new parameters at the classical value of $\phi = M$.

All steps are exactly the same as before; the only difference comes from the
value of the coupling constant contribution of each of the three diagrams.
The result is given by:


$$
V_{\rm eff} = \frac{\lambda}{4!}\phi^4 + C\,\phi^4\left(\log\frac{\phi^2}{M^2} - \frac{25}{6}\right), \qquad
C = \frac{1}{64\pi^2}\left(\frac{\lambda^2}{4} + \frac{\lambda^2}{36} + 3e^4\right) \tag{10.136}
$$


Figure 10.6. This graph does not contribute because the momentum vector does not
couple to the propagator.

[This result is almost identical to Eq. (10.132), except that the last factor of $3e^4$
comes from the trace of the Landau propagator, and the extra 1/3 ratio between the
$\phi_2$ contribution and the $\phi_1$ contribution comes from different Wick expansion
coefficients.]

We now make the assumption that $\lambda$ is of the order of $e^4$ (which we will
show is self-consistent). Choosing the new minimum to be $\langle\phi\rangle = M$, we find the
potential to be:


$$
V_{\rm eff} = \frac{\lambda}{4!}\phi^4 + \frac{3e^4}{64\pi^2}\phi^4\left(\log\frac{\phi^2}{\langle\phi\rangle^2} - \frac{25}{6}\right) \tag{10.137}
$$


We know that the slope of the potential must vanish at the minimum:


$$
V'(\langle\phi\rangle) = \left(\frac{\lambda}{6} - \frac{11e^4}{16\pi^2}\right)\langle\phi\rangle^3 = 0 \tag{10.138}
$$

Solving, we find:

$$
\lambda = \frac{33}{8\pi^2}e^4 \tag{10.139}
$$


Thus, our assumption $\lambda \sim e^4$ is self-consistent. Moreover, we find a nontrivial
constraint between two previously arbitrary coupling constants. We have traded
the two dimensionless coupling constants $e$ and $\lambda$ for a dimensionful parameter
$\langle\phi\rangle$ and a dimensionless parameter $e$. As we said earlier, this is an example of
dimensional transmutation. At first, this might seem strange, because the original
theory had no mass parameters at all, and yet a new mass parameter seems to have
mysteriously entered into our theory.
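
The self-consistency of this value of $\lambda$ can be checked numerically. The sketch below assumes the potential of Eq. (10.137) with the $\lambda^2$ term dropped, sets $\langle\phi\rangle = 1$, and compares against the compact form $\frac{3e^4}{64\pi^2}\phi^4(\log(\phi^2/\langle\phi\rangle^2) - \frac{1}{2})$ that results from substituting Eq. (10.139). All function names are hypothetical.

```python
import math


def v_eff_sqed(phi, lam, e, vev=1.0):
    """Tree term plus the e^4 one-loop term, as in Eq. (10.137)."""
    log_term = math.log(phi**2 / vev**2)
    loop = 3.0 * e**4 / (64.0 * math.pi**2) * phi**4 * (log_term - 25.0 / 6.0)
    return lam / 24.0 * phi**4 + loop


def lam_from_e(e):
    """The coupling fixed by demanding zero slope at phi = vev, Eq. (10.139)."""
    return 33.0 * e**4 / (8.0 * math.pi**2)


def v_compact(phi, e, vev=1.0):
    """Compact form obtained after eliminating lambda in favor of e."""
    log_term = math.log(phi**2 / vev**2)
    return 3.0 * e**4 / (64.0 * math.pi**2) * phi**4 * (log_term - 0.5)


def numerical_slope(f, x, h=1e-6):
    """Central finite difference, used only as an independent check."""
    return (f(x + h) - f(x - h)) / (2.0 * h)
```

With `lam = lam_from_e(e)`, the finite-difference slope of `v_eff_sqed` at `phi = vev` should vanish to numerical precision, and `v_eff_sqed` should agree with `v_compact` for any positive `phi`.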

The origin of this new mass comes from renormalization theory. Even in scale- 
invariant theories with no mass parameters, renormalization theory introduces a 
mass parameter because we must perform the subtraction of divergent diagrams 
at some mass scale $M$. Changes in $M$ simply involve a change in the definition
of the coupling constant. (This forms the basis of renormalization group theory, 
which will be discussed in further detail in Chapter 14.) 

Finally, with this new value of 4, we can now write: 


$$
V_{\rm eff} = \frac{3e^4}{64\pi^2}\phi^4\left(\log\frac{\phi^2}{\langle\phi\rangle^2} - \frac{1}{2}\right) \tag{10.140}
$$


This is our final result, which shows that there is indeed a new minimum of the
potential away from the origin, as claimed. We also mention that the generalization
to gauge theory proceeds as expected, with little change. The only complication
is that new graphs are generated from the interaction Lagrangian that contains the
coupling:

$$
\phi^2\left(A_\mu \cdot A^\mu\right) \tag{10.141}
$$


As we said earlier, to illustrate this method we have made a number of assumptions;
for example, the effective potential in Eq. (10.125) is defined at zero
external momenta. Let us now justify this assumption from a more general point
of view. To generalize our calculation, we will use the path integral method of
effective actions. We recall that $Z(J) = e^{iW(J)}$ is the generator of Green's functions, and
that $W(J)$ is the generator of connected graphs. In Section 8.4, we showed
that a Legendre transformation produces the effective action:


$$
\Gamma(\phi) = W(J) - \int d^4x\, J(x)\phi(x) \tag{10.142}
$$


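We can power expand $\Gamma$ as:


$$
\Gamma(\phi) = \sum_n \frac{1}{n!}\int d^4x_1 \cdots d^4x_n\, \Gamma^{(n)}(x_1, \ldots, x_n)\,\phi(x_1)\cdots\phi(x_n) \tag{10.143}
$$


Each of the $\Gamma^{(n)}(x_1, \ldots, x_n)$ is the sum over all one-particle irreducible Feynman
graphs. What we are interested in, however, is the effective potential $V_{\rm eff}$, which
is defined by taking the position space expansion of $\Gamma$:


$$
\Gamma(\phi) = \int d^4x\left[-V_{\rm eff}(\phi) + \frac{1}{2}Z(\phi)\,\partial_\mu\phi\,\partial^\mu\phi + \cdots\right] \tag{10.144}
$$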


where the term without any derivatives is defined to be the effective potential.
To calculate a manageable expression for the effective potential, we will take the
Fourier transform of $\Gamma^{(n)}$:


$$
\Gamma^{(n)}(x_1, \ldots, x_n) = \int\frac{d^4k_1}{(2\pi)^4}\cdots\frac{d^4k_n}{(2\pi)^4}\,(2\pi)^4\delta^4(k_1 + \cdots + k_n)\,
e^{i(k_1\cdot x_1 + \cdots + k_n\cdot x_n)}\,\Gamma^{(n)}(k_1, \ldots, k_n) \tag{10.145}
$$


Now let us insert this equation into the power expansion of $\Gamma(\phi)$ in $\phi$. Inserting one
into the other, we find:


$$
\begin{aligned}
\Gamma(\phi) &= \sum_n \frac{1}{n!}\int d^4x_1 \cdots d^4x_n \int\frac{d^4k_1}{(2\pi)^4}\cdots\frac{d^4k_n}{(2\pi)^4}\int d^4x\; e^{i(k_1 + \cdots + k_n)\cdot x} \\
&\qquad \times e^{i(k_1\cdot x_1 + \cdots + k_n\cdot x_n)}\left[\Gamma^{(n)}(0, 0, \ldots, 0)\,\phi(x_1)\phi(x_2)\cdots\phi(x_n) + \cdots\right] \\
&= \int d^4x \sum_n \frac{1}{n!}\left[\Gamma^{(n)}(0, 0, \ldots, 0)\,\phi(x)^n + \cdots\right]
\end{aligned} \tag{10.146}
$$


In the last step, we have power expanded $\Gamma^{(n)}(k_1, \ldots, k_n)$ and have taken only the
lowest term. The higher-order terms contribute higher order derivatives of the
fields, in which we are not interested.

Now comes the key step, comparing this equation with the power expansion
of $\Gamma(\phi)$ in terms of $\phi$. Comparing only the lowest-order term (which contains no
derivatives of the field $\phi$), we can now extract the effective potential:


$$
V_{\rm eff}(\phi) = -\sum_n \frac{1}{n!}\,\Gamma^{(n)}(0, 0, \ldots, 0)\,\phi^n(x) \tag{10.147}
$$


This is the desired expression. It simply says that the effect of summing over
the loop expansion produces a series of Feynman diagrams with zero momenta
$\Gamma^{(n)}(0, 0, \ldots, 0)$, such that they act as the effective potential for a new action. In
this way, we can justify all the steps that we made earlier from more intuitive
arguments.

The ultimate use of the Coleman—Weinberg method, however, remains un- 
clear, especially since our accelerators have not been able to pin down the Higgs 
particle and its interactions, other than the fact that its mass must be greater than 
90 GeV. At the very least, we must use the Coleman—Weinberg mechanism to 
calculate radiative corrections to standard spontaneously broken theories to show 
that radiative corrections do not spoil the breakdown of symmetry. In other words, 
the Coleman—Weinberg mechanism can erase minima as well as create them in 
the potential. In this way, we find that the mechanism gives us bounds on the 
hypothetical mass of the Higgs particle. 

For the Weinberg—Salam model, a very straightforward summing of radiative 
corrections coming from the scalar, fermion, and vector meson loops gives the 
correction to the potential: 


$$ V(\phi) = C\phi^4\log(\phi^2/M^2) \tag{10.148} $$


where: 
1 we 4 4 ) 4 


where V represents the sum over Z and W bosons. The value of v can be 
determined by solving for the minimum of the potential: 


$$
\frac{\partial V}{\partial\phi}\bigg|_{\phi = v/\sqrt{2}} = 0 \tag{10.150}
$$

The mass of the Higgs field then becomes:


10°V v2 3 
= = 2v?{04+C]}lo (sz) +5]| 10.151) 
m6 7 ag? Be 20 | EXIM 2 ( 


Putting everything together, we find: 


1 3a7(2 + sect Oy) 
Cyt = ——— YV 3m, = ———_—_ (10.152) 
mo =U = 6x04 » Y 16V2G ¢ sint Ow 
For the Weinberg–Salam model, we must have:

$$ m_\phi > 7.9\ \text{GeV} \tag{10.153} $$


or else the radiative corrections will overwhelm the theory and destabilize the 
vacuum. This bound is easily met. 

Alternatively, one might postulate that the Higgs mechanism is driven entirely 
by radiative corrections. In this interesting case, we find: 


$$ m_\phi \sim 11\ \text{GeV} \tag{10.154} $$


(which is experimentally ruled out). 

In closing, we should also mention that a broken symmetry may be restored 
under certain conditions. If we consider a ferromagnet, for example, we know 
that the Hamiltonian does not select out any preferred direction, but the vacuum 
state may consist of atoms that are all aligned. However, if we heat the magnet 
sufficiently, the spins become more disordered until a phase transition occurs. At 
even higher temperatures, the spin alignment is completely lost, and randomness 
is restored. 

Likewise, in a quantum field theory a spontaneously broken symmetry may 
also be restored if we place the system in a hot enough environment. This is called 
symmetry restoration. Although the temperature necessary to restore a broken
symmetry is extraordinarily high, this is not an academic question. It may have 
great physical implications if we consider the temperatures found originally near 
the Big Bang. 

Perhaps, at the instant of the Big Bang, a unified theory of all known quan- 
tum forces possessed a symmetry large enough to include the strong, weak, and 
electromagnetic interactions and possibly even gravity. As the temperature of 
the universe rapidly cooled, the original symmetry broke down in several stages. 
The gravitational interactions first broke off from the particle interactions, then 
the GUT symmetry broke down into the SU(3) ⊗ SU(2) ⊗ U(1) symmetry of the
Standard Model, then this group broke down into SU(3) ⊗ U(1).

If this general picture is correct, then the study of symmetry restoration gives
us a useful tool by which to probe the universe at early cosmological times. This 
is discussed in more detail in the exercises. 

In summary, we have seen how spontaneous symmetry breaking is perhaps 
the most elegant way in which symmetries can be broken. We retain all the 
symmetries of the theory in the Lagrangian, but the symmetry is broken via the 
vacuum state. In particular, spontaneous symmetry breaking allows us to generate 
a mass for the Yang—Mills theory without spoiling renormalizability. This was the 
crucial step in creating the Weinberg—Salam model, which successfully unites the 
electromagnetic interactions with the weak interactions. 

In the next chapter, we will discuss how the Yang—Mills theory also forms the 
basis of the strong interactions, giving us the possibility of splicing all quantum 
interactions into a single Standard Model. 


10.9 Exercises 


1. Write down the Lagrangian for a model of Higgs mesons with local O(N)
symmetry, broken down to O(M) symmetry, for N > M, with the Higgs 
transforming in the vector representation. 


2. Do the same for a Lagrangian of Higgs with local SU(N) symmetry broken 
down to SU(M) symmetry (N > M), with the Higgs transforming in the 
fundamental representation. 


3. Derive the Feynman rules in the ’t Hooft gauge for Exercises 1 and 2.


4. Calculate the Coleman–Weinberg potential for self-interacting O(N) mesons
in the vector representation. 


5. Consider the two-dimensional Gross—Neveu model, with N massless fermions 
$\psi^a$. The action is given by:


pri gy? + ayy (10.155) 


(This contains a four-fermion interaction, which is nonrenormalizable in four 
dimensions.) By power counting, is this theory renormalizable in two dimen- 
sions? Why? Write down all possible divergent graphs. Show that the action 
is invariant under a discrete transformation: 


$$ \psi \to \gamma_5\psi, \qquad \bar\psi \to -\bar\psi\gamma_5 \tag{10.156} $$


6. Show that the Gross–Neveu Lagrangian can be rewritten as:
7a ° N z 7.4 fa 
wid? — aaa +owy (10.157) 
§0 


where o is a scalar field. 


7. Examine the one-loop graphs in this theory with external o legs and an 
internal fermion loop. Calculate the effective potential V(o) for the o field 
by summing over one-loop graphs. Show that it equals: 


2n 
N 4, an ip ( =e 
—iV =—-—i—o* — — T —— | ———_ 10.158 
ae ie |e pi tie ( ) 
8. Show that the potential can be written as:
where we have taken a Euclidean integral and cut it off at momentum $\Lambda$.


9. Define a renormalization mass M, defined by: 
= N-‘— (10.160) 


Solve for g, and write the potential as: 


ce De o2 ; 
— — — ——_ = 161 
v= E + ri oO (108 M2 3) | (10 ) 


Show that there is a minimum to this potential at a negative value, less 
than V(0). Show that the theory has spontaneous symmetry breakdown,
and that dimensional transmutation has occurred. Which parameter has been 
exchanged for which parameter? 


10. A four-dimensional precursor to the Gross–Neveu model is the Nambu–Jona-Lasinio
model,¹⁸ with an interaction Lagrangian given by:


$$ g\left[(\bar\psi\psi)^2 - (\bar\psi\gamma_5\psi)^2\right] \tag{10.162} $$


Show that it is invariant under a global continuous transformation given by a
chiral U(1) ⊗ U(1). Is the theory renormalizable? Perform the same analysis
as before. Add in an extra $\sigma$ field, and perform the integration over $\psi$; that is,
sum the fermion bubble graphs. Show that chiral symmetry breaking occurs
dynamically, that is, the fundamental action has no scalars, so the symmetry
breaking occurs via a pseudoscalar bound state of the fermions.


11. Add in a Maxwell term to make the Nambu–Jona-Lasinio action locally
chirally invariant. Show that the massless particles are removed in this case.


12. Although we have mainly discussed symmetry breaking, consider a model
where a broken symmetry may be restored if we heat the system sufficiently.
Consider a theory defined with potential $V = \frac{1}{2}m^2\phi^2 + (\lambda/4!)\phi^4$ for $m^2 < 0$
with a Euclidean metric. Consider the finite-temperature Green's function:


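$$
G_\beta(x_1, x_2, \ldots, x_N) = \frac{\mathrm{Tr}\left[e^{-\beta H}\,T\,\phi(x_1)\phi(x_2)\cdots\phi(x_N)\right]}{\mathrm{Tr}\,e^{-\beta H}} \tag{10.163}
$$


where $\beta = 1/kT$, $T$ is the temperature, and $k$ is the Boltzmann constant. Show
that the one-loop correction to the potential is given by: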


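$$
V_1 = \frac{1}{2\beta}\sum_n\int\frac{d^3k}{(2\pi)^3}\log\left(\frac{4\pi^2n^2}{\beta^2} + k^2 + M^2\right) \tag{10.164}
$$


where $M^2 = m^2 + \frac{1}{2}\lambda\phi^2$ and where the theory is periodic in time and therefore
has integral Fourier moments labelled by n.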


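13. The sum in the previous problem diverges, so use the trick:


$$
v(E) = \sum_n \log\left(\frac{4\pi^2n^2}{\beta^2} + E^2\right), \qquad
\frac{\partial v(E)}{\partial E} = \sum_n \frac{2E}{4\pi^2n^2/\beta^2 + E^2} \tag{10.165}
$$

Using the fact that:

$$
\sum_{n=1}^{\infty}\frac{y}{y^2 + n^2} = -\frac{1}{2y} + \frac{\pi}{2}\coth\pi y \tag{10.166}
$$

show that:


$$
\frac{\partial v(E)}{\partial E} = 2\beta\left(\frac{1}{2} + \frac{1}{e^{\beta E} - 1}\right), \qquad
v(E) = 2\beta\left[(E/2) + \beta^{-1}\log\left(1 - e^{-\beta E}\right)\right] + \cdots \tag{10.167}
$$

Then show that this implies:


$$
V_1 = \int\frac{d^3k}{(2\pi)^3}\left[(E_M/2) + \beta^{-1}\log\left(1 - e^{-\beta E_M}\right)\right] \tag{10.168}
$$


where $E_M^2 = k^2 + M^2$.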
14. Show that, for high temperature (small $\beta$), the expression for the potential is:


$$
V_1 = -\frac{\pi^2}{90\beta^4} + \frac{M^2}{24\beta^2} - \frac{1}{12\pi}\frac{M^3}{\beta}
 - \frac{1}{64\pi^2}M^4\log(M^2\beta^2) + \frac{c}{64\pi^2}M^4 + O(M^6\beta^2) \tag{10.169}
$$

where $c \sim 5.41$. Take the second derivative of $V_1$ with respect to $\phi$. From
this, show that the symmetry is restored when:


$$
\frac{1}{\beta^2} = -\frac{24m^2}{\lambda} \tag{10.170}
$$


Calculate the order of magnitude of this temperature for the Weinberg—Salam 
model. At what temperature is the SU(2) ⊗ U(1) symmetry restored? Can
these temperatures be found on the earth, in a star, or in the early universe? 


15. Prove Eqs. (10.136) and (10.149). 


16. For superconductors, assume that there is an attractive force between electrons 
that forms Cooper pairs. Assume that this many-body system can be described 
by the Ginzburg—Landau action, which couples a ¢ field to Maxwell’s theory: 


$$
\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + D_\mu\phi\,D^\mu\phi^* - m^2|\phi|^2 - \lambda|\phi|^4 \tag{10.171}
$$


For small enough temperature, spontaneous symmetry breaking occurs at the
minimum $|\phi|^2 = -m^2/2\lambda > 0$. Construct the conserved current $j_\mu$. For the
static case, calculate the vector current $\mathbf{j}$. Assume that $\phi$ varies slowly over
the medium, then show that this implies London's equation (i.e., $\mathbf{j} = -k^2\mathbf{A}$),
where $k^2 = -em^2/2\lambda$.


17. By Ohm’s law, we have E = Rj. For the previous problem, using London’s 
equation, show that this means that the resistance is zero. Now take the curl 
of Ampere’s equation. Show that this implies: 


$$ \nabla^2\mathbf{B} = k^2\mathbf{B} \tag{10.172} $$

so that (in one dimension) $B \sim e^{-kx}$. This means that magnetic fields
are expelled in a superconductor (Meissner effect), with penetration depth
characterized by $1/k$. This implies that spontaneous symmetry breaking has
made the Maxwell field massive, with mass $k$.


Chapter 11 
The Standard Model 


This was a great time to be a high-energy theorist, the period of the 
famous triumph of quantum field theory. And what a triumph it was, in 
the old sense of the word: a glorious victory parade, full of wonderful 
things brought back from far places to make the spectator gasp with awe 
and laugh with joy. 

—S. Coleman 


11.1 The Quark Model 


The Standard Model, based on the gauge group SU(3) ⊗ SU(2) ⊗ U(1), is one of
the great successes of the gauge revolution. At present, the Standard Model can 
apparently describe all known fundamental forces (excluding gravity). 

The Standard Model is certainly not the final theory of particle interactions. It 
was created by crudely splicing the electroweak theory and the theory of quantum 
chromodynamics (QCD). It cannot explain the origin of the quark masses or the 
various coupling constants. The theory is rather unwieldy and inelegant. However, 
at present, it seems to be able to explain an enormous body of experimental data. 
Not only is it renormalizable, it can explain a vast number of results from all areas 
of particle physics, such as neutrino scattering experiments, hadronic sum rules, 
weak decays, current algebras, etc. In fact, there is no piece of experimental data 
that violates the Standard Model. 

In this chapter, we will discuss the Standard Model by first reviewing the 
experimental situation in the 1960s with regard to the quark model. Then we 
will present compelling evidence, from a wide variety of quarters, that the strong 
interactions can be described by QCD. Then we will marry QCD to the Weinberg–Salam
model to produce the Standard Model.

Although the Standard Model makes the situation seem so clear today, back
in the 1960s the experimental situation with the strong interactions was totally 
confused, with hundreds of “elementary particles” pouring out of our particle 
accelerators. J. Robert Oppenheimer, in exasperation, said that the Nobel Prize 
should be given to the physicist who did not discover a new particle. Although 
the Yukawa theory of strong interactions was fully renormalizable, the coupling 
constant of the strong interactions was large, and hence perturbation theory was
unreliable:


$$ \frac{g^2}{4\pi} \sim 14 \tag{11.1} $$


One important observation was that the existence of resonances usually indicated
the presence of bound states, so Sakata¹ in the 1950s postulated that the
hadrons could be considered to be composite states built out of p, n, and $\Lambda$
particles. Then Ikeda, Ohnuki, and Ogawa² in 1959 made the suggestion that
this triplet of particles transformed in the fundamental representation 3 of SU(3).
They correctly said that the mesons could be built out of bound states of 3 and $\bar{3}$:


$$ 3 \otimes \bar{3} = 8 \oplus 1 \tag{11.2} $$


However, several of their assignments were incorrect. 

In 1961, the correct SU(3) assignments were finally found by Gell-Mann³,⁴
and Ne’eman,⁵ who postulated that the baryons and mesons could be arranged
in what they called the Eightfold Way. Then Gell-Mann⁶ and Zweig⁷ proposed
that these SU(3) assignments could be generated if one postulated the existence
of new constituents, called “quarks,” which transformed as a triplet 3. Since all
representations of SU(N) can be generated by taking multiple products of the
fundamental representation, in this way we could generate all higher representations
beginning with the quarks.

The quarks belonged to the fundamental representation of SU(3): 


$$
3 = q = \begin{pmatrix} u \\ d \\ s \end{pmatrix} \tag{11.3}
$$


where the quarks were called the “up,” “down,” and “strange” quark, for historical
reasons. The u and the d quarks formed a standard SU(2) isodoublet, but the
addition of the third quark was necessary because it was observed in the 1950s
that a new quantum number in addition to isospin was conserved by hadronic 
processes, called “strangeness.” This new quantum number could be explained 
in terms of SU(3), which is a rank 2 Lie group. Its representations are therefore 
labeled by two numbers, the third component of isospin $T_3$ and also a new quantum
number Y, called “hypercharge.” 

The new quantum number of strangeness and hypercharge could be related to
each other via the Gell-Mann–Nishijima⁸,⁹ formula:


$$
Q = T_3 + \frac{Y}{2} = \begin{pmatrix} 2/3 & 0 & 0 \\ 0 & -1/3 & 0 \\ 0 & 0 & -1/3 \end{pmatrix} \tag{11.4}
$$


where $Y = B + S$. $B$ is the baryon number, $S$ is the strangeness number, $T_3$ is the
third component of isospin, and $Q$ is the charge.

To fit the known spectrum, the mesons were postulated to be composites of 
a quark and an antiquark, while the baryons were postulated to be composites of 
three quarks. Thus, we expect to see the mesons and baryons arranged according 
to the following tensor product decomposition: 


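$$
\begin{aligned}
\text{Meson} &= 3 \otimes \bar{3} = 8 \oplus 1 \\
\text{Baryon} &= 3 \otimes 3 \otimes 3 = 10 \oplus 8 \oplus 8 \oplus 1
\end{aligned} \tag{11.5}
$$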


The theory predicted that the mesons should be arranged in terms of octets and 
singlets, while baryons should be in octets as well as decuplets. The fact that this 
simple picture could arrange the known mesons and baryons in such an elegant 
picture was remarkable. 
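
A quick dimension count confirms that both decompositions are at least consistent. This sketch uses the standard dimension formula for an SU(3) irreducible representation with Dynkin labels (p, q), which is not derived in the text and is assumed here; the function name is hypothetical.

```python
def su3_dim(p, q):
    """Dimension of the SU(3) irreducible representation with Dynkin labels (p, q)."""
    return (p + 1) * (q + 1) * (p + q + 2) // 2


# Representations appearing in the meson and baryon decompositions.
TRIPLET = su3_dim(1, 0)
ANTITRIPLET = su3_dim(0, 1)
SINGLET = su3_dim(0, 0)
OCTET = su3_dim(1, 1)
DECUPLET = su3_dim(3, 0)

# Left- and right-hand sides of the two tensor product decompositions.
meson_lhs = TRIPLET * ANTITRIPLET
meson_rhs = OCTET + SINGLET
baryon_lhs = TRIPLET**3
baryon_rhs = DECUPLET + OCTET + OCTET + SINGLET
```

Matching dimensions is only a necessary condition; it does not by itself establish which irreducible representations appear.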

In order to reproduce the known charges of the mesons and baryons, it was 
necessary to give the quarks fractional charges: 


$$
Q_u = \frac{2}{3}e, \qquad Q_d = -\frac{1}{3}e, \qquad Q_s = -\frac{1}{3}e \tag{11.6}
$$


Since three of them were required to make up a single baryon, this meant that
each of them had baryon number 1/3. We summarize the quantum numbers of the
quarks in the following chart:


$$
\begin{array}{c|cccccc}
 & Q & T & T_3 & Y & S & B \\ \hline
u & 2/3 & 1/2 & 1/2 & 1/3 & 0 & 1/3 \\
d & -1/3 & 1/2 & -1/2 & 1/3 & 0 & 1/3 \\
s & -1/3 & 0 & 0 & -2/3 & -1 & 1/3
\end{array} \tag{11.7}
$$
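
The chart can be cross-checked against the Gell-Mann–Nishijima formula with exact fractions. This is a minimal sketch: the quantum numbers are the ones quoted above, while the function names and the convention of writing antiquarks as uppercase letters are hypothetical choices made only for this example.

```python
from fractions import Fraction

# (T3, B, S) for the three light quarks, as in the quark chart.
QUARKS = {
    "u": (Fraction(1, 2), Fraction(1, 3), Fraction(0)),
    "d": (Fraction(-1, 2), Fraction(1, 3), Fraction(0)),
    "s": (Fraction(0), Fraction(1, 3), Fraction(-1)),
}


def quark_charge(flavor):
    """Charge in units of e from Q = T3 + Y/2 with Y = B + S."""
    t3, baryon_number, strangeness = QUARKS[flavor]
    hypercharge = baryon_number + strangeness
    return t3 + hypercharge / 2


def hadron_charge(content):
    """Total charge of a bound state.

    Assumed convention for this sketch: lowercase letters are quarks and
    uppercase letters are the corresponding antiquarks (opposite charge).
    """
    total = Fraction(0)
    for symbol in content:
        sign = -1 if symbol.isupper() else 1
        total += sign * quark_charge(symbol.lower())
    return total
```

For example, `hadron_charge("uud")` gives the proton charge and `hadron_charge("uD")` the charge of the $\pi^+$, both built from fractionally charged constituents.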


For the meson spectrum, we can get a rough classification by considering the
bound states generated by a $q\bar{q}$ pair. The bound states arrange themselves roughly
in the following angular momentum series (similar to the familiar series found in
spectroscopy using the notation $^{2S+1}L_J$):


$$ {}^1S_0,\ {}^3S_1,\ {}^1P_1,\ {}^3P_0,\ {}^3P_1,\ {}^3P_2,\ {}^1D_2,\ \ldots \tag{11.8} $$


The fit between experiment and the predicted bound states of the quark model
was exceptionally good. For example, the octet containing the $\pi$ meson and K
meson corresponds to the $^1S_0$ bound state, while the K* multiplet is part of a $^3S_1$
bound state:


$$
\begin{array}{l|l|l}
\text{Quark state} & {}^1S_0 & {}^3S_1 \\ \hline
|u\bar{d}\rangle & \pi^+(140) & \rho^+(770) \\
2^{-1/2}|d\bar{d} - u\bar{u}\rangle & \pi^0(135) & \rho^0(770) \\
|d\bar{u}\rangle & \pi^-(140) & \rho^-(770) \\
2^{-1/2}|d\bar{d} + u\bar{u}\rangle & \eta(549) & \omega(783) \\
|u\bar{s}\rangle & K^+(494) & K^{*+}(892) \\
|d\bar{s}\rangle & K^0(498) & K^{*0}(892) \\
|s\bar{d}\rangle & \bar{K}^0(498) & \bar{K}^{*0}(892) \\
|s\bar{u}\rangle & K^-(494) & K^{*-}(892) \\
|s\bar{s}\rangle & \eta'(958) & \phi(1020)
\end{array} \tag{11.9}
$$


Similarly, we can also analyze the baryons. The familiar proton and neutron
belong to the octet, while the A resonance (found in pion—nucleon scattering) 
belongs to the decuplet: 


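$$
\begin{array}{l|l|l}
\text{Quark state} & \text{Octet} & \text{Decuplet} \\ \hline
|uuu\rangle & & \Delta^{++}(1230) \\
|uud\rangle & p(938) & \Delta^{+} \\
|udd\rangle & n(940) & \Delta^{0}(1232) \\
|ddd\rangle & & \Delta^{-}(1234) \\
|uus\rangle & \Sigma^+(1189) & \Sigma^{*+}(1383) \\
2^{-1/2}|(ud + du)s\rangle & \Sigma^0(1192) & \Sigma^{*0}(1384) \\
|dds\rangle & \Sigma^-(1197) & \Sigma^{*-}(1387) \\
2^{-1/2}|(ud - du)s\rangle & \Lambda(1116) & \\
|uss\rangle & \Xi^0(1315) & \Xi^{*0}(1532) \\
|dss\rangle & \Xi^-(1321) & \Xi^{*-}(1535) \\
|sss\rangle & & \Omega^-(1672)
\end{array} \tag{11.10}
$$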


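To see how the bound states are constructed, it is sometimes useful to rearrange
the meson and baryon matrices according to their quark wave functions. Let us
define:


$$
q \otimes \bar{q} =
\begin{pmatrix}
(2u\bar{u} - d\bar{d} - s\bar{s})/3 & u\bar{d} & u\bar{s} \\
d\bar{u} & (2d\bar{d} - u\bar{u} - s\bar{s})/3 & d\bar{s} \\
s\bar{u} & s\bar{d} & (2s\bar{s} - u\bar{u} - d\bar{d})/3
\end{pmatrix}
+ \frac{1}{3}\,\mathbf{1}\,(u\bar{u} + d\bar{d} + s\bar{s}) \tag{11.11}
$$


In terms of the familiar pseudoscalar mesons, we have the following arrangement
of the meson matrix:


$$
M = \begin{pmatrix}
\frac{1}{\sqrt{2}}\pi^0 + \frac{1}{\sqrt{6}}\eta & \pi^+ & K^+ \\
\pi^- & -\frac{1}{\sqrt{2}}\pi^0 + \frac{1}{\sqrt{6}}\eta & K^0 \\
K^- & \bar{K}^0 & -\frac{2}{\sqrt{6}}\eta
\end{pmatrix} \tag{11.12}
$$


Likewise, the baryon matrix can be arranged as:


$$
B = \begin{pmatrix}
\frac{1}{\sqrt{2}}\Sigma^0 + \frac{1}{\sqrt{6}}\Lambda & \Sigma^+ & p \\
\Sigma^- & -\frac{1}{\sqrt{2}}\Sigma^0 + \frac{1}{\sqrt{6}}\Lambda & n \\
\Xi^- & \Xi^0 & -\frac{2}{\sqrt{6}}\Lambda
\end{pmatrix} \tag{11.13}
$$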


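To actually perform any calculations with SU(3), we need, of course, an
explicit representation of the generators of the algebra. We will choose the
standard Gell-Mann representation of SU(3) generators in terms of 3 x 3 Hermitian
matrices:


$$
\begin{aligned}
\lambda_1 &= \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, &
\lambda_2 &= \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, &
\lambda_3 &= \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \\
\lambda_4 &= \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, &
\lambda_5 &= \begin{pmatrix} 0 & 0 & -i \\ 0 & 0 & 0 \\ i & 0 & 0 \end{pmatrix}, &
\lambda_6 &= \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \\
\lambda_7 &= \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}, &
\lambda_8 &= \frac{1}{\sqrt{3}}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{pmatrix}
\end{aligned} \tag{11.14}
$$

where:

$$
\mathrm{Tr}(\lambda_i\lambda_j) = 2\delta_{ij}, \qquad
\left[\frac{\lambda_i}{2}, \frac{\lambda_j}{2}\right] = if_{ijk}\frac{\lambda_k}{2} \tag{11.15}
$$


The structure constants are given by:


$$
\begin{aligned}
f_{123} &= 1; \qquad f_{147} = -f_{156} = f_{246} = f_{257} = f_{345} = -f_{367} = \frac{1}{2}; \\
f_{458} &= f_{678} = \frac{\sqrt{3}}{2}
\end{aligned} \tag{11.16}
$$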


From these commutation relations, we can work out the representations of the
group (see Appendix). Because SU(3) is a rank 2 Lie group, we can chart
the various representations of the group in a two-dimensional space, plotting
the eigenvalues of $\lambda_3$ of the SU(2) subgroup against $\lambda_8$, which is proportional
to the hypercharge. Thus, on a two-dimensional graph (isospin plotted against
hypercharge) we can pictorially represent the triplet, antitriplet, octet, decuplet,
etc. (Fig. 11.1)
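
The structure constants can be recovered directly from the matrices, which is a useful check on any hand-copied table of them. The sketch below assumes the standard Gell-Mann matrices and uses the identity $f_{ijk} = \mathrm{Tr}([\lambda_i, \lambda_j]\lambda_k)/4i$, which follows from the trace normalization and commutation relations above; the helper names are hypothetical, and only the Python standard library is used.

```python
import math


def gell_mann():
    """Return the eight Gell-Mann matrices as nested lists (index 0 holds lambda_1)."""
    i = 1j
    s = 1.0 / math.sqrt(3.0)
    return [
        [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
        [[0, -i, 0], [i, 0, 0], [0, 0, 0]],
        [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
        [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
        [[0, 0, -i], [0, 0, 0], [i, 0, 0]],
        [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
        [[0, 0, 0], [0, 0, -i], [0, i, 0]],
        [[s, 0, 0], [0, s, 0], [0, 0, -2 * s]],
    ]


def matmul(a, b):
    """Product of two 3 x 3 matrices stored as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)] for r in range(3)]


def trace(a):
    return sum(a[k][k] for k in range(3))


def structure_constant(i, j, k):
    """f_ijk = Tr([lambda_i, lambda_j] lambda_k) / (4i), with indices running 1..8."""
    lam = gell_mann()
    a, b, c = lam[i - 1], lam[j - 1], lam[k - 1]
    ab, ba = matmul(a, b), matmul(b, a)
    commutator = [[ab[r][col] - ba[r][col] for col in range(3)] for r in range(3)]
    return (trace(matmul(commutator, c)) / 4j).real
```

Calling `structure_constant(1, 2, 3)` or `structure_constant(4, 5, 8)` reproduces the corresponding entries of the table, and the antisymmetry of $f_{ijk}$ can be tested by permuting the arguments.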

Although hadronic masses are not exactly SU(3) invariant (i.e., the masses
of particles within a multiplet vary slightly), it is reasonable to assume that the
terms that break SU(3) symmetry should themselves transform covariantly under
SU(3). We assume, for example, that the mass term in the Hamiltonian includes
a term that breaks the symmetry and transforms like $\lambda_8$, that is, like hypercharge.
We assume the mass term has the form:


$$ \bar\psi\left(a + b\lambda_8\right)\psi + \ldots \tag{11.17} $$


This, in turn, gives us nontrivial relations between the masses of the various
particles within a multiplet called the Gell-Mann–Okubo¹⁰ mass relation, which
provided experimental verification of the theory. It gives us the mass relation:


$$ m_N + m_\Xi = \frac{1}{2}\left(m_\Sigma + 3m_\Lambda\right) \tag{11.18} $$


which agrees well with experiment. The left-hand side equals 2.25 GeV
experimentally, while the right-hand side equals 2.23 GeV.
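
The level of agreement can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes isospin-averaged octet masses in MeV built from rounded values (938 and 940 for the nucleons, 1189, 1192, and 1197 for the $\Sigma$ states, 1116 for the $\Lambda$, 1315 and 1321 for the $\Xi$ states), and the names are hypothetical.

```python
# Isospin-averaged octet masses in MeV (assumed rounded inputs).
OCTET_MASS = {
    "N": (938 + 940) / 2,
    "Sigma": (1189 + 1192 + 1197) / 3,
    "Lambda": 1116.0,
    "Xi": (1315 + 1321) / 2,
}


def gell_mann_okubo_sides(mass):
    """Return (lhs, rhs) of m_N + m_Xi = (m_Sigma + 3 m_Lambda) / 2 in MeV."""
    lhs = mass["N"] + mass["Xi"]
    rhs = 0.5 * (mass["Sigma"] + 3.0 * mass["Lambda"])
    return lhs, rhs


lhs, rhs = gell_mann_okubo_sides(OCTET_MASS)
relative_mismatch = abs(lhs - rhs) / lhs
```

With these inputs the two sides differ by well under one percent. The exact figures depend on how the masses within each isospin multiplet are averaged.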
For the spin-$3/2^+$ decuplet, we also find the equal-spacing rule:


$$ M_\Omega - M_{\Xi^*} = M_{\Xi^*} - M_{\Sigma^*} = M_{\Sigma^*} - M_\Delta \tag{11.19} $$


The experimental mass differences are 139, 149, and 152 MeV, respectively.

Historically, the prediction of the $\Omega^-$ mass from this formula gave a boost to
the wide acceptance of SU(3) symmetry.
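
The logic of that prediction can be sketched numerically. The example assumes rounded, multiplet-averaged decuplet masses in MeV (1232, 1385, 1533, and 1672, chosen here as illustrative inputs), and extrapolates one step beyond the $\Xi^*$ using the mean of the two lower gaps; this extrapolation rule and the function names are hypothetical choices for the illustration.

```python
# Rounded multiplet-averaged decuplet masses in MeV (assumed inputs).
DECUPLET_MASS = {"Delta": 1232.0, "Sigma*": 1385.0, "Xi*": 1533.0, "Omega": 1672.0}
CHAIN = ["Delta", "Sigma*", "Xi*", "Omega"]


def spacings(mass):
    """Mass gaps between successive members of the decuplet chain."""
    return [mass[upper] - mass[lower] for lower, upper in zip(CHAIN, CHAIN[1:])]


def predict_omega(mass):
    """Extrapolate one equal step beyond Xi*, using the mean of the two lower gaps."""
    lower_gaps = spacings(mass)[:2]
    return mass["Xi*"] + sum(lower_gaps) / len(lower_gaps)


omega_prediction = predict_omega(DECUPLET_MASS)
relative_error = abs(omega_prediction - DECUPLET_MASS["Omega"]) / DECUPLET_MASS["Omega"]
```

The gaps computed this way lie within a few MeV of the values quoted above, and the extrapolated $\Omega^-$ mass lands within about one percent of the measured one.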

Because of the success of the SU(3) quark model, attempts were made to
generalize this to larger groups. One attempt, merging SU(3) with the SU(2)
of spin to create SU(6),[11–13] tried to mix an internal symmetry with a space
symmetry. This was possible because SU(2) ⊗ SU(3) ⊂ SU(6). SU(6) had some
success in predicting the magnetic moments of baryons. [However, attempts
to generalize SU(6) to the relativistic case foundered because of the
Coleman–Mandula theorem.]

Attempts were also made to generalize SU(3) to SU(4) [14] and beyond by adding
more quarks. This approach received experimental vindication with the discovery
of the charmed quark c in 1974 and the bottom quark b in 1977. [However,
because the masses of the charmed and bottom quarks are so large, global SU(4)
and SU(5) are less reliable than SU(2) and SU(3) experimentally.]

Figure 11.1. When we plot isospin against hypercharge, we can represent the triplet,
antitriplet, octet, decuplet, and higher multiplets in simple geometrical patterns.

Figure 11.2. With charm plotted on the vertical axis, the quark model gives an excellent
fit to the charmed meson multiplet.

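For the charmed quark system, the new $q\bar{q}$ and $qqq$ states are given the
following group-theoretical assignments:


$$
\begin{aligned}
4 \otimes \bar{4} &= 15 \oplus 1 \\
4 \otimes 4 \otimes 4 &= \bar{4} \oplus 20 \oplus 20 \oplus 20
\end{aligned} \tag{11.20}
$$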


The charmed quark bound states are given the following names for the $0^-$ and
$1^-$ multiplets (Fig. 11.2):


$$
\begin{array}{c|c}
0^- & 1^- \\ \hline
\eta_c(2980) & J/\psi(3097) \\
D^+(1869) & (D^*)^+(2010) \\
D^0(1865) & (D^*)^0(2007) \\
\bar{D}^0(1865) & (\bar{D}^*)^0(2007) \\
D^-(1869) & (D^*)^-(2010) \\
D_s^+(1970) & D_s^{*+}(2109) \\
D_s^-(1970) & D_s^{*-}(2109)
\end{array} \tag{11.21}
$$


We can also write down an explicit form for the generators of the SU(4) global
symmetry. In fact, it is possible to find a simple iterative algorithm to write down
the generators of SU(N) almost by inspection. We first notice that we can write
the first three $\lambda_a$ matrices as follows:


$$
\lambda_a = \begin{pmatrix} \sigma_a & 0 \\ 0 & 0 \end{pmatrix}, \qquad a = 1, 2, 3 \tag{11.22}
$$


where this symbolically means that the Pauli spin matrices are placed in the upper
left-hand corner for a = 1, 2, 3.

Next, $\lambda_4$ through $\lambda_7$ obey a simple pattern. Along the right column and bottom row,
we insert the numbers 1 and 1 (as well as −i and i) symmetrically in all possible
slots. Finally, the generator $\lambda_8$ has the unit 2 × 2 matrix in the upper left-hand
corner and we choose the last number along the diagonal to make it traceless.
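The pattern just described for SU(3) can be iterated for any N. The following is a minimal sketch of that iteration, under the assumption that each new diagonal generator is normalized so that $\mathrm{Tr}(\lambda_a \lambda_b) = 2\delta_{ab}$; the function name `su_generators` is illustrative only.

```python
import math


def su_generators(n):
    """Return the n*n - 1 generalized Gell-Mann matrices of SU(n) as nested lists."""
    if n < 2:
        raise ValueError("n must be at least 2")
    gens = []
    for m in range(2, n + 1):
        # Step 1: embed the SU(m-1) generators in the upper left-hand corner.
        gens = [[row + [0] for row in g] + [[0] * m] for g in gens]
        # Step 2: fill the new right column and bottom row with (1, 1) and (-i, i).
        for k in range(m - 1):
            sym = [[0] * m for _ in range(m)]
            sym[k][m - 1] = sym[m - 1][k] = 1
            asym = [[0] * m for _ in range(m)]
            asym[k][m - 1], asym[m - 1][k] = -1j, 1j
            gens += [sym, asym]
        # Step 3: unit (m-1) x (m-1) block, with the last entry fixed by tracelessness.
        norm = math.sqrt(2.0 / (m * (m - 1)))
        diag = [[0] * m for _ in range(m)]
        for k in range(m - 1):
            diag[k][k] = norm
        diag[m - 1][m - 1] = -(m - 1) * norm
        gens.append(diag)
    return gens


for n in (2, 3, 4, 5):
    print(f"SU({n}): {len(su_generators(n))} generators")
```

For n = 2 the loop returns the Pauli matrices, for n = 3 the eight Gell-Mann matrices in their usual order, and for n = 4 the last generator is diag(1, 1, 1, −3)/√6.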

From this algorithm, we can easily write down the generators of SU(N) if we
know the generators of SU(N − 1). For example, we can now write down the
generators of SU(4) almost by inspection. To see this, we place the generators of
SU(3) in the upper left-hand corner for a = 1, ..., 8:


$$
\lambda_a = \begin{pmatrix} \lambda_a^{SU(3)} & 0 \\ 0 & 0 \end{pmatrix}, \qquad a = 1, \ldots, 8 \tag{11.23}
$$


For the generators $\lambda_9$ through $\lambda_{14}$, we place the pairs of numbers 1, 1 and −i, i
symmetrically in the right column and bottom row, while for the last generator
$\lambda_{15}$ we put the unit 3 × 3 matrix in the upper left-hand corner and make it
traceless, with the same normalization as the other generators:

$$
\lambda_{15} = \frac{1}{\sqrt{6}}
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -3 \end{pmatrix}
$$

Similarly, the generators of SU(5) can be constructed in this way.

Although less is known about the quark spectrum for mesons containing the bottom
quark, all states discovered so far obey the quark model predictions. The lowest
lying states include:


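$$
\begin{array}{c|c}
0^- & 1^- \\ \hline
\eta_b & \Upsilon(9460) \\
B^+(5278) & (B^*)^+(5324) \\
B^0(5278) & (B^*)^0(5324) \\
\bar{B}^0(5278) & (\bar{B}^*)^0(5324) \\
B^-(5278) & (B^*)^-(5324)
\end{array}
$$


(The $\eta_b$ has not been firmly established, and the charges of the $B^*$ are not yet
confirmed.)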

Today, the original three quarks have been expanded to six quarks: the up,
down, strange, charmed, bottom, and top quark. All but the last have been
discovered. The global symmetry group SU(N) for N quarks is now called the
“flavor” symmetry.

The rough values for the constituent quark masses in GeV are given by:


$$
\begin{array}{ll}
m_u = 0.35, & m_c = 1.5, \\
m_d = 0.35, & m_b = 5.0, \\
m_s = 0.5, & m_t > 91
\end{array} \tag{11.25}
$$


These quarks, in turn, can be arranged in three identical families or generations,
each having the same quantum numbers: (u, d), (c, s), and (t, b). (The reason
why nature should prefer three identical generations of quarks and leptons is one 
of the great mysteries of subatomic physics.) 

Historically, although the quark model had great success in bringing order out 
of the chaos of the hundreds of resonances found in scattering experiments, it also 
raised a host of other problems. In fact, each year, even as the successes of the 
quark model began to pile up, the questions raised by the quark model also began 
to proliferate. For example, why were the quarks not observed experimentally? 
Were they real, or were they just a useful mathematical device? And what was 
the binding force that held the quarks together? For example, some believed 
that the glue that held the quarks together might be a vector meson; however, 
to be renormalizable, it had to be massless. But this was impossible, because 
if it was massless, then it should generate a long-range force, like gravity and 
electromagnetism, rather than being a short-range force like the strong force. 


11.2 QCD 


After years of confusion, the theory that has emerged to give us the best under- 
standing of the strong interactions is called QCD, which has the Lagrangian: 


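$$
\mathcal{L} = -\frac{1}{4} F^a_{\mu\nu} F^{a\,\mu\nu}
+ \sum_{i=1}^{6} \bar{\psi}_i \left( i\gamma^\mu D_\mu - m_i \right) \psi_i \tag{11.26}
$$
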
where the Yang—Mills field is massless and carries the SU(3) “color” force [not 
to be confused with the global SU (3) flavor symmetry introduced earlier]. Unlike 
the electroweak theory, where the gauge group is broken and the Z and W become 
massive, the color group is unbroken and the gluons remain massless. 

The quarks have two indices. The i index is taken over the flavors, which
labels the up, down, strange, charm, top, and bottom quarks. The flavor index is
not gauged; it represents a global symmetry. However, the quarks also carry the
important local color SU(3) index (which is suppressed here). In other words,
quarks come in six flavors and three colors, but only the color index participates
in the local gauge symmetry. From the point of view of QCD, the flavor index,
which dominated most of the phenomenology of the 1960s and 1970s, is now
relegated to a relatively minor role compared to the color force, which binds the
quarks together.

In fact, from the perspective of QCD, we can see the origin of the early 
phenomenological successes of global SU(3). In the limit of equal quark masses, 
the QCD action possesses an additional symmetry, global SU(N) symmetry for 
N flavors of quarks. For the u and d quarks, this is a very good approximation; 
so SU(2) global symmetry is experimentally seen in the hadron spectrum. The 
s quark mass, although larger, is still relatively close to the u and d mass when 
compared to the baryon mass; so we expect flavor SU(3) symmetry to be a
relatively good one. However, the masses of the c and b are much larger; so we 
expect SU(4) or SU(5) flavor symmetry breaking to be quite large. The higher 
flavor symmetry groups are hence less useful phenomenologically. 

Although quarks have never been seen in the laboratory, there is now an 
overwhelming body of data supporting the claim that QCD is the leading theory 
of the strong interactions. This large body of theoretical and experimental results 
and data, accumulated slowly and painfully over the past several decades, can be 
summarized in the following sections. 


11.2.1 Spin-Statistics Problem 


According to the spin-statistics theorem, a fermion must necessarily be totally 
antisymmetric with respect to the interchange of the quantum numbers of its 
constituents. One long-standing problem, however, was that certain baryon states,
such as the 10 representation, which includes the $\Delta^{++}$ resonance, were purely
symmetric under this interchange, violating the spin-statistics theorem.

For example, the wave function for this resonance is naively given by:


$$
\Psi_\Delta = \psi_{SU(3)}\, \psi_{\rm orbital}\, \psi_{\rm spin} \tag{11.27}
$$


This wave function is symmetric under the interchange of any two quarks, which
is typical of bosonic, not fermionic, states. To see this, notice that the SU(3) flavor
part of the wave function is symmetric, since it is composed of three u quarks, all
pointing in the same direction in isospin space, as in (11.10). Also, since the spin
of this resonance is 3/2, all three quark spins are pointing spatially in the same
direction, so the spin wave function is also symmetric. Finally, the interchange of
the quarks in the orbital part yields a factor $(-1)^L$, which is one because L = 0 for
this resonance. Thus, under an interchange of any two quarks, the wave function
picks up a factor of (+1)(+1)(+1) = 1, so the overall wave function is symmetric
under the interchange of the quarks, which therefore violates the spin-statistics
theorem for fermions.

Since physicists were reluctant to abandon the spin-statistics theorem, one
resolution of this problem was to postulate the existence of yet another isospin
symmetry, a new “color” symmetry, so that the final state could be fully
antisymmetric. This was the original motivation behind the Han–Nambu [15] quark model,
a precursor of QCD.


11.2.2 Pair Annihilation


The simplest way experimentally to determine the nature of this mysterious new 
color symmetry is to perform electron—antielectron collision experiments. Pair 
annihilation creates an off-shell photon, which then decays into various possible 
combinations. We are interested in the process:


$$
e^+ + e^- \to \gamma \to \bar{q} + q \to \text{hadrons} \tag{11.28}
$$


This process is highly sensitive to the number of quarks and their charges that 
appear in the calculation. Using Feynman’s rules, we find that the cross section 
must be proportional to the number of quark colors times the sum of the squares 
of the quark charges. In practice, it is convenient to divide by the leptonic 
contribution to the cross section:


$$
e^+ e^- \to \mu^+ \mu^- \tag{11.29}
$$


By taking the ratio of these two cross sections, we should therefore find the pure 
contribution of the quark color sector. In particular, we find:


$$
R = \frac{\sigma(e^+ e^- \to \text{hadrons})}{\sigma(e^+ e^- \to \mu^+ \mu^-)} = N_c \sum_i Q_i^2 \tag{11.30}
$$


where $N_c$ is the number of quark colors and $Q_i$ is the charge of each quark.

For low energies, when we excite just the u, d, s quarks, we expect R to equal
3(4 + 1 + 1)/9 = 2. When we hit the threshold for creating charm–anticharm
intermediate states, then this ratio rises to over 4. If we include the u, d, s, c, b
quarks, then we have R = 11/3. This agrees rather well with
experiment, assuming that there are three colors (Fig. 11.3).
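The parton-level values of R quoted here follow from simple charge counting. A minimal sketch, using exact fractions and the standard quark charges (the dictionary and function names are illustrative):

```python
from fractions import Fraction

# Electric charges of the quarks in units of the proton charge.
QUARK_CHARGE = {
    "u": Fraction(2, 3), "d": Fraction(-1, 3), "s": Fraction(-1, 3),
    "c": Fraction(2, 3), "b": Fraction(-1, 3), "t": Fraction(2, 3),
}


def r_ratio(active_flavors, n_colors=3):
    """Lowest-order R = N_c * sum of squared charges of the active quark flavors."""
    return n_colors * sum(QUARK_CHARGE[q] ** 2 for q in active_flavors)


for flavors in ("uds", "udsc", "udscb"):
    print(flavors, r_ratio(flavors), "with three colors;",
          r_ratio(flavors, n_colors=1), "with one")
```

Without the color factor the same counting gives 2/3, 10/9, and 11/9, a factor of three below the values above, which is why the measurement of R discriminates so sharply between the two hypotheses. This counting ignores resonances and higher-order corrections near thresholds.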


11.2.3 Jets


High-energy scattering experiments should be able to knock individual gluons
and quarks out of the nucleus. Although they quickly reform into bound states
and hence cannot be isolated, they should make a characteristic multiprong event
in the scattering apparatus. These multiprong events, as predicted, have been
produced in high-energy scattering experiments.

For example, two-prong jets have a characteristic distribution dependence
given by $(1 + \cos^2\theta)$, where $\theta$ is the angle between the beam and the jets. This is
consistent with the process $e^- + e^+ \to q + \bar{q}$ for spin-$\frac{1}{2}$ quarks, as expected. This
important topic will be discussed in the next section.

Figure 11.3. The plot of R against energy, which agrees with the value of R found in QCD.


11.2.4 Absence of Exotics 


Although the quark model had great success in fitting the known hadrons into
$3 \otimes \bar{3}$ and $3 \otimes 3 \otimes 3$ bound states, it was at a loss to explain why exotic states, such
as $3 \otimes 3$, etc., should not be formed as well. Because the original quark model
gave no indication of what the binding force was, this question could never be 
answered within the context of the old quark model. 

QCD, however, gives a simple reason why these exotic states are absent. We
learned earlier in Section 9.4 that the states of the unbroken Yang–Mills theory are
singlets under the gauge group. We notice immediately that $q\bar{q}$ and $qqq$ states are
invariant under the color group because they are contracted by constant invariant
tensors, like the Kronecker delta $\delta_{ij}$ and the antisymmetric tensor $\epsilon_{ijk}$. Low-lying exotic
states, because they are nonsinglets under the color group, are either absent or will
decay into the usual bound states. (Later, when we discuss lattice gauge theory,
we will see that QCD can, in principle, give us numerical results to back up this
heuristic result.)


11.2.5 Pion Decay


The Feynman diagram for the decay $\pi^0 \to 2\gamma$ consists of an internal quark triangle
loop, with the pion and the gamma rays attached to the three corners of the triangle. 
Thus, this decay rate is proportional to the sum over all the quarks that occur in 
this internal triangle loop. By comparing the experimental decay rate of the pion 
into two gamma rays, we can therefore calculate the number of quark colors. The 
experimental evidence supports the presence of 3.01 + 0.08 colors. 


11.2.6 Asymptotic Freedom 


Historically, it was the discovery of asymptotic freedom that elevated QCD into 
the leading theory of the strong interactions. Deep inelastic experiments, such as 
$e + p \to e + \text{anything}$, showed that the cross sections exhibited scale invariance
at high energies; that is, the form factors lost their dependence on certain mass 
parameters at high energies. This scale invariance could be interpreted to mean 
that the quark constituents acted as if they were free particles at extremely high 
energies. 

QCD offers a simple explanation of this scale invariance. Using the renormal- 
ization group, which will be discussed at length in Chapter 14, one could show 
that the coupling constant became smaller at high energies, which could explain 
the reason why the quarks behaved as if they were free. Asymptotic freedom gave 
a simple reason why the naive quark model, which described complex scattering 
experiments with free quark fields, had such phenomenological success.
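The falloff of the coupling with energy can be made concrete with the standard one-loop expression $\alpha_s(Q^2) = 12\pi / [(33 - 2 n_f) \ln(Q^2/\Lambda^2)]$. This formula is not derived in this section (the renormalization group is deferred to Chapter 14), and the values of $\Lambda$ and $n_f$ below are illustrative assumptions, so the sketch should be read as a rough guide only.

```python
import math


def alpha_s_one_loop(q_gev, n_flavors=5, lambda_qcd_gev=0.2):
    """One-loop running strong coupling at momentum scale q_gev (in GeV).

    lambda_qcd_gev and n_flavors are assumed, illustrative values.
    """
    if q_gev <= lambda_qcd_gev:
        raise ValueError("the one-loop formula is only meaningful for Q > Lambda")
    beta0 = 33 - 2 * n_flavors
    return 12 * math.pi / (beta0 * math.log(q_gev ** 2 / lambda_qcd_gev ** 2))


for q in (2.0, 10.0, 100.0):
    print(f"Q = {q:6.1f} GeV  alpha_s ~ {alpha_s_one_loop(q):.3f}")
```

The coupling decreases logarithmically as Q grows, and the same expression blows up as Q approaches $\Lambda$, which is the perturbative hint of the strong coupling at low energies.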


11.2.7 Confinement 


The renormalization group also showed the converse, that the coupling constant 
should become large at low energies, suggesting that the quarks were perma- 
nently bound inside a hadron. This gave perhaps the most convincing theoretical 
justification that the quarks should be permanently “confined” inside the bound 
states. Although a rigorous proof that the Yang—Mills theory confines the quarks 
and gluons has still not been found, the renormalization group approach gives us 
a compelling theoretical argument that the coupling constant is large enough at 
small energies to confine the quarks and gluons. If correct, this approach also 
explains why the massless gluon field does not result in a long-range force, like 
gravity or the electromagnetic force. Although the range of a massless field is 
formally infinite, the gluon field apparently “condenses” into a stringlike glue 
that binds the quarks together at the ends. [In Chapter 15, we will present some
compelling (but not rigorous) numerical justification for this picture using lattice
gauge theory.]

Thus, with one theory, we are able to interpret two divergent facts, that quarks
appear to be confined at low energies but act as if they are free particles at high
energy.

The phenomenological success of the quark model, of course, is not exclu- 
sively confined to the strong interactions. Quarks also participate in the weak 
interactions, and have greatly clarified the origins of certain phenomenological 
models proposed in the 1960s (such as current algebras). By studying the weak 
currents generated by the quarks, one can find a simple quark model explana- 
tion for a number of phenomenologically important results from weak interaction 
physics, such as are given in the following sections. 


11.2.8 Chiral Symmetry 


In the limit that the quarks have vanishingly small mass, the QCD action possesses
yet another global flavor symmetry, chiral symmetry. For N flavors, the QCD
action for massless quarks is invariant under chiral SU(N) ⊗ SU(N), generated
by:

$$
q \to e^{i\lambda^a \theta^a} q, \qquad q \to e^{i\gamma_5 \lambda^a \theta^a} q \tag{11.31}
$$

[Actually, the QCD action has the additional chiral symmetry U(1) ⊗ U(1)
if we drop the $\lambda^a$ in the previous expression. The first U(1) symmetry gives the
usual baryon number conservation. The second chiral U(1) symmetry will be
studied in more detail in the next chapter.]
Since the u, d, and s quarks have relatively small mass when compared with
the scale of the strong interactions, QCD has a global chiral symmetry in this
approximation:

$$
m_u \sim m_d \sim m_s \sim 0 \;\to\; SU(3) \otimes SU(3) \tag{11.32}
$$


The SU(3) ⊗ SU(3) chiral symmetry of QCD, in turn, allows us to compute a
large number of relations and sum rules between different physical processes.

Since chiral symmetry is broken in nature, for small quark masses we can,
in fact, view the $\pi$ meson as the Goldstone boson for broken chiral symmetry.
The fact that the $\pi$ meson has an exceptionally light mass is a good indicator of
the validity of chiral symmetry as an approximate symmetry. A large number of
successful sum rules for the hadronic weak current, as we shall see, can be written
down as a byproduct of the smallness of the pion mass.

Figure 11.4. Typical Feynman diagrams describing the collision of electrons and positrons,
creating jets.


11.2.9 No Anomalies 


The theory of leptons given by the Weinberg—Salam model in Chapter 10 is actu- 
ally fatally flawed by the presence of something called “anomalies,” which will 
be discussed in more detail in the next chapter. These anomalies sometimes arise 
when a classical symmetry of an action does not survive the process of quantiza- 
tion. In particular, there are certain divergent fermionic triangle graphs that can 
potentially destroy the Ward—Takahashi identities and hence ruin renormalizabil- 
ity. However, when the quarks are inserted into the Weinberg—Salam model, they 
also produce anomalies, but of the opposite sign. In fact, the charge assignments 
of the quarks and leptons in the Standard Model are precisely the ones that cancel 
the anomaly. The vanishing of the lepton anomalies against the quark anomalies 
in the Weinberg—Salam model can be seen as one more theoretical justification of 
the Standard Model. 
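The cancellation can be illustrated with simple bookkeeping. The sketch below sums the hypercharges that enter two of the triangle-graph conditions, over one generation. The hypercharge values are the standard assignments in the convention $Q = T_3 + Y/2$, supplied here as assumptions since the anomaly conditions themselves are only derived in the next chapter.

```python
from fractions import Fraction as F

# One generation: (name, chirality, hypercharge Y, colors, SU(2) multiplicity).
# Hypercharges are assumed standard values with Q = T3 + Y/2.
QUARKS = [
    ("quark doublet", "L", F(1, 3), 3, 2),
    ("u_R", "R", F(4, 3), 3, 1),
    ("d_R", "R", F(-2, 3), 3, 1),
]
LEPTONS = [
    ("lepton doublet", "L", F(-1), 1, 2),
    ("e_R", "R", F(-2), 1, 1),
]


def anomaly_sums(fields):
    """Return (sum of Y^3 over left minus right fields, sum of Y over doublets)."""
    sign = {"L": 1, "R": -1}
    y_cubed = sum(sign[ch] * nc * mult * y ** 3 for _, ch, y, nc, mult in fields)
    y_doublets = sum(nc * y for _, _, y, nc, mult in fields if mult == 2)
    return y_cubed, y_doublets


print("leptons only:", anomaly_sums(LEPTONS))
print("quarks only: ", anomaly_sums(QUARKS))
print("together:    ", anomaly_sums(QUARKS + LEPTONS))
```

Neither sector vanishes on its own, but the two contributions are equal and opposite, and the factor of three from color is essential for the cancellation.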


11.3 Jets 


Now that we have completed a broad overview of the theoretical and experimental 
successes of the Standard Model, let us now focus in detail on some of the specifics. 
One of the most graphic reasons for supporting QCD is the existence of “jets” in 
electron—positron collisions. In Figure 11.4, we see some of the typical Feynman 
diagrams that arise when electrons collide with positrons. 

The momentum transfer is so large that the quarks, antiquarks, and gluons in
the final states are scattered in different directions. They later regroup into standard
hadrons to form a jet-like structure, which has been found experimentally.

We will analyze the process with the following labeling of momenta:


$$
e^-(q) + e^+(q') \to q(p) + \bar{q}(p') + g(k) \tag{11.33}
$$

with $Q = q + q' = p + p' + k$ and $s = Q^2$.


For the two-jet event (where we drop the final gluon), we can factorize the
transition matrix into a leptonic and a hadronic part:


$$
|\mathcal{M}|^2 = \frac{1}{s^2}\, l_{\mu\nu} H^{\mu\nu} \tag{11.34}
$$


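where:


$$
\begin{aligned}
l_{\mu\nu} &= \frac{1}{4} \sum_{\rm spins} \langle 0 | J_\mu | e^+ e^- \rangle \langle 0 | J_\nu | e^+ e^- \rangle^* \\
&= \frac{e^2}{4}\, \mathrm{Tr}\left( \gamma_\mu \frac{\not{q}}{2 q_0} \gamma_\nu \frac{\not{q}'}{2 q'_0} \right) \\
&= \frac{e^2}{4 q_0 q'_0} \left( q_\mu q'_\nu + q_\nu q'_\mu - g_{\mu\nu}\, q \cdot q' \right)
\end{aligned} \tag{11.35}
$$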


The hadronic part, to lowest order, can also be calculated in the same way:


$$
H_{\mu\nu} = \frac{e_f^2}{p_0 p'_0} \left( p_\mu p'_\nu + p_\nu p'_\mu - g_{\mu\nu}\, p \cdot p' \right) \tag{11.36}
$$


where $e_f$ is the electric charge of the quark flavor index f. Then we have:


$$
|\mathcal{M}|^2 = \frac{e^2 e_f^2}{s^2} \left( 1 + \cos^2\theta \right) \tag{11.37}
$$


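where $\theta$ is the center-of-mass scattering angle, so the differential and total cross
sections for the two-jet process are given by:


$$
\begin{aligned}
\left( \frac{d\sigma}{d\Omega} \right)_0 &= \frac{\alpha^2}{4s} \left( 1 + \cos^2\theta \right) 3 \sum_f \left( \frac{e_f}{e} \right)^2 \\
\sigma_0 &= \frac{4\pi\alpha^2}{s} \sum_f \left( \frac{e_f}{e} \right)^2
\end{aligned} \tag{11.38}
$$


This $(1 + \cos^2\theta)$ dependence on the angle $\theta$ has actually been seen experimentally
in two-jet processes, strengthening our belief in spin-$\frac{1}{2}$ quarks.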
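The step from the differential to the total cross section is just the angular integral of $(1 + \cos^2\theta)$ over the full solid angle, which equals $16\pi/3$. The sketch below checks this numerically with a midpoint rule and shows that a differential cross section of the form $(\alpha^2/4s)(1 + \cos^2\theta)\, N_c \sum Q^2$ integrates to $(4\pi\alpha^2/s) \sum Q^2$ for three colors. The function names are illustrative.

```python
import math


def integrate_angular(n_steps=20000):
    """Midpoint-rule integral of (1 + cos^2 theta) over the full solid angle."""
    h = 2.0 / n_steps
    total = 0.0
    for k in range(n_steps):
        c = -1.0 + (k + 0.5) * h  # c stands for cos(theta)
        total += (1.0 + c * c) * h
    return 2.0 * math.pi * total  # the azimuthal integral contributes 2 pi


def sigma_times_s_over_alpha_sq(sum_charge_sq, n_colors=3):
    """Total cross section in units of alpha^2 / s for a given sum of squared charges."""
    return 0.25 * integrate_angular() * n_colors * sum_charge_sq


print(integrate_angular(), 16 * math.pi / 3)
print(sigma_times_s_over_alpha_sq(1.0), 4 * math.pi)
```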
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For the three-jet event the calculation is a bit more difficult because of the 
kinematics. The only complication is the hadronic part: 


Hw = >> (aaelJul0)(948|J010)* 


Sea aGeny el 
= ee? | o( + Dy 
8 Po Poko Fy f p-k e 


| 
— ng 1) 


«I (n GH DF = <W+by)] (11.39) 
where $\epsilon$ is the gluon polarization. This then becomes:
Ayy = —_ ye e78 ° 
PoPoko 
(2 (2[P, Puy +(k, p+ P'Iuv) 
+ 5 (P+ Kav ~(, Pho) 


! / / {i 
+ TEE ((p, p’ + kl» —[p’, p I») ) (11.40) 


where we have defined $[p, q]_{\mu\nu} = p_\mu q_\nu + q_\mu p_\nu - g_{\mu\nu}\, p \cdot q$. Then the transition
element:


$$
|\mathcal{M}|^2 = \left( \frac{1}{s} \right)^2 l_{\mu\nu} H^{\mu\nu} \tag{11.41}
$$
becomes: 
a 1 2 
Ml = Tee poppe ere) 


U 


P-p a? ee 
= (Sper pe Oe -q')+2(p-q'\(p'-q) 
— 2k gtk -q')+(Q-qXQ-1)| 


1 
+ Park -aX(Q - p’) —2(p- q\(p - q’) =. 2(p’ . q\(p' . q’)| 
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1 
+ [MO a: p) - 2p -a\p-4')— 210" gio" -4’)]) 
(11.42) 


In practice, it is difficult to tell which jet emerged from which constituent.
Therefore, let us number the jets. Let $p_{i\mu}$ represent the momentum of the ith jet.
We define the variables:


$$
x_i = p_{i0}/E \tag{11.43}
$$


where $p_3 = k$. Let us also choose the center-of-mass frame, so that $q = (E, \mathbf{q})$
and $q' = (E, -\mathbf{q})$.
Then the kinematics gives us:


$$
x_1 + x_2 + x_3 = 2 \tag{11.44}
$$
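For massless jets, the energy fractions fix the whole event shape. The sketch below takes $x_1$ and $x_2$, obtains $x_3$ from the constraint that the three fractions sum to 2, and computes the opening angles from $1 - \cos\theta_{ij} = 2(1 - x_k)/(x_i x_j)$, which follows from $(p_i + p_j)^2 = (Q - p_k)^2$ with $s = 4E^2$. It also evaluates the factor $(x_1^2 + x_2^2)/[(1 - x_1)(1 - x_2)]$, the standard lowest-order shape of the three-jet distribution. Masslessness is an assumption, and the names are illustrative.

```python
import math


def three_jet(x1, x2):
    """Angles (degrees) between three massless jets and the lowest-order shape factor."""
    x3 = 2.0 - x1 - x2
    xs = (x1, x2, x3)
    if not all(0.0 < x < 1.0 for x in xs):
        raise ValueError("each x_i must lie strictly between 0 and 1")

    def angle(i, j, k):
        # 1 - cos(theta_ij) = 2 (1 - x_k) / (x_i x_j) for massless momenta
        cos_ij = 1.0 - 2.0 * (1.0 - xs[k]) / (xs[i] * xs[j])
        return math.degrees(math.acos(max(-1.0, min(1.0, cos_ij))))

    weight = (x1 ** 2 + x2 ** 2) / ((1.0 - x1) * (1.0 - x2))
    return {"x3": x3, "theta12": angle(0, 1, 2), "theta13": angle(0, 2, 1),
            "theta23": angle(1, 2, 0), "weight": weight}


print(three_jet(2 / 3, 2 / 3))   # symmetric configuration
print(three_jet(0.95, 0.90))     # soft, nearly collinear gluon
```

The symmetric configuration has all three jets 120 degrees apart, while the weight grows rapidly as $x_1$ or $x_2$ approaches 1, that is, when the gluon becomes soft or collinear and the event looks like two jets.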


The differential cross section is then given by: 


*d? p d°p' d°k 84 (p+ p'+k—Q) (11.45) 


8 - 


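Inserting the value of $|\mathcal{M}|^2$ into the cross section, we get:


$$
d\sigma = \sigma_0\, \frac{2\alpha_s}{3\pi}\, \frac{x_1^2 + x_2^2}{(1 - x_1)(1 - x_2)}\, dx_1\, dx_2 \tag{11.46}
$$

where $\alpha_s = g^2/4\pi$.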


The experimental data is sometimes analyzed in terms of a quantity called
“thrust,” which can be defined as:


$$
T = \max\, \frac{2 \sum_i{}'\, p_{i\parallel}}{\sum_i |\mathbf{p}_i|} \tag{11.47}
$$

where the prime in the numerator means we sum over all particles in just one
hemisphere. The thrust is a good variable because it varies from $T = \frac{1}{2}$ (isotropy)
to T = 1 (perfect jet behavior). In Figure 11.5, we see some typical three-jet
events found at PETRA compared to the theoretical prediction. The agreement
with the experimental data is excellent.
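For an event with only a few particles, the thrust can be computed exactly by brute force. The sketch below uses the equivalent form $T = \max_{\epsilon_i = \pm 1} |\sum_i \epsilon_i \mathbf{p}_i| / \sum_i |\mathbf{p}_i|$, where the signs label the two hemispheres; when the total momentum vanishes this reproduces the hemisphere definition. The cost grows as $2^n$, so this is an illustration rather than a practical algorithm, and the function name is hypothetical.

```python
import math
from itertools import product


def thrust(momenta):
    """Thrust of a list of 3-momenta, by brute force over hemisphere assignments."""
    total = sum(math.sqrt(px * px + py * py + pz * pz) for px, py, pz in momenta)
    if total == 0:
        raise ValueError("need at least one nonzero momentum")
    best = 0.0
    for signs in product((1, -1), repeat=len(momenta)):
        sx = sum(s * p[0] for s, p in zip(signs, momenta))
        sy = sum(s * p[1] for s, p in zip(signs, momenta))
        sz = sum(s * p[2] for s, p in zip(signs, momenta))
        best = max(best, math.sqrt(sx * sx + sy * sy + sz * sz))
    return best / total


two_jet = [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]
half = math.sqrt(3) / 2
symmetric_three_jet = [(1.0, 0.0, 0.0), (-0.5, half, 0.0), (-0.5, -half, 0.0)]
print(thrust(two_jet), thrust(symmetric_three_jet))
```

A perfectly back-to-back pair gives T = 1, while the symmetric three-jet configuration gives T = 2/3, between the two-jet and isotropic limits.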
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Figure 11.5. Experimentally, jets are found in collision processes, confirming the predic- 
tion of QCD (solid line). 


11.4 Current Algebra 


In the 1960s, before the rise of gauge theories and the Weinberg—Salam model, 
the original four-fermion Fermi action was a useful phenomenological guide that 
could account for some of the qualitative features of the weak interactions. 

The hadrons participate in the weak interactions. For example, the hadrons are 
mostly unstable and decay via the weak interactions. The 6 decay of the neutron 
is the classic example of the weak interaction of the hadrons. Mimicking the 
success of the simple Fermi action, attempts were made to describe the hadronic 
weak interactions by postulating that the action was the product of two currents. 
The phenomenology of the weak interactions in the 1960s was dominated by
something called current algebra,[16,17] which postulated the commutation relations
of the currents among themselves. By taking their matrix elements, one could
therefore derive sum rules that linked different physical processes. Although 
there was no understanding why the effective action should be the product of two 
currents, or why they should obey a chiral algebra, these current algebra relations 
agreed rather well with experiment. From the perspective of the Standard Model, 
however, we can see the simple origin of the current algebras. 

To be specific, let us analyze the weak interactions of the quarks. We will
insert quarks into the Weinberg–Salam model, arranging them into standard SU(2)
doublets and U(1) singlets, which gives us the Glashow–Weinberg–Salam model.
We now pair off the three generations of quarks u, d, c, s, t, and b with the three
generations of leptons:


$$
\binom{u}{d}_L \leftrightarrow \binom{\nu_e}{e}_L; \qquad
\binom{c}{s}_L \leftrightarrow \binom{\nu_\mu}{\mu}_L; \qquad
\binom{t}{b}_L \leftrightarrow \binom{\nu_\tau}{\tau}_L \tag{11.48}
$$

Notice that we are suppressing the color index on the quarks. From the point of
view of the weak interactions, the color force is not important.

As with the electron, the quarks also have right-handed SU(2) singlets:
$u_R$, $d_R$, $c_R$, $s_R$, $t_R$, $b_R$, which are necessary when we construct mass eigenstates
for the quarks. (We will describe the question of mixing between these various
generations shortly.)

The Lagrangian for the Standard Model then consists of three parts: 


$$
\mathcal{L}_{\text{Standard Model}} = \mathcal{L}_{\rm WS}(\text{lept.}) + \mathcal{L}_{\rm WS}(\text{had.}) + \mathcal{L}_{\rm QCD} \tag{11.49}
$$


where W-S stands for the Weinberg—Salam model, and lept. and had. stand for 
the leptons and quarks that are inserted into the Weinberg—Salam model with the 
correct SU(2) ® U(1) assignments. (We assume that both the leptons and hadrons 
couple to the same Higgs field in the usual way. We also ignore quark mixing 
here.) 

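In order to get the correct quantum numbers, such as the charge, we must
choose the following covariant derivatives for the left-handed and right-handed
quarks in $\mathcal{L}_{\rm WS}(\text{had.})$:


$$
D_\mu \binom{u}{d}_L = \left[ \partial_\mu - i \left( \frac{g}{2} \right) \sigma^i W^i_\mu - i \left( \frac{g'}{6} \right) B_\mu \right] \binom{u}{d}_L \tag{11.50}
$$


and:


$$
\begin{aligned}
D_\mu u_R &= \left[ \partial_\mu - i (2g'/3) B_\mu \right] u_R \\
D_\mu d_R &= \left[ \partial_\mu + i (g'/3) B_\mu \right] d_R
\end{aligned} \tag{11.51}
$$


Since the quarks have different charges than the leptons, the coefficients appearing
in the covariant derivatives are different from Eq. (10.63), in order to reproduce
the correct coupling of the photon $A_\mu$ to the quarks.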
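One way to see why these particular coefficients are needed is to read each coefficient of $B_\mu$ as $g' Y/2$ and check that $Q = T_3 + Y/2$ reproduces the quark charges. The sketch below does this bookkeeping with exact fractions; the identification of the coefficients with $Y/2$ is the usual convention and is stated here as an assumption.

```python
from fractions import Fraction as F

# (weak isospin T3, hypercharge Y), with Y/2 read off from the coefficient of B_mu.
QUARK_FIELDS = {
    "u_L": (F(1, 2), F(1, 3)),    # doublet: coefficient g'/6
    "d_L": (F(-1, 2), F(1, 3)),   # doublet: coefficient g'/6
    "u_R": (F(0), F(4, 3)),       # singlet: coefficient 2g'/3
    "d_R": (F(0), F(-2, 3)),      # singlet: coefficient -g'/3
}


def charge(name):
    """Electric charge from Q = T3 + Y/2."""
    t3, y = QUARK_FIELDS[name]
    return t3 + y / 2


for name in QUARK_FIELDS:
    print(name, charge(name))
```

Both chiralities of the up quark come out with charge 2/3 and both chiralities of the down quark with charge −1/3, as required for a vectorlike coupling to the photon.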

From this form of the Standard Model action, several important conclusions 
can be drawn. First, the gluons from QCD only interact with the quarks, not 
the leptons. Thus, discrete symmetries like parity are conserved for the strong 
interactions. Second, the chiral SU(N) ⊗ SU(N) symmetry, which is respected
by the QCD action in the limit of vanishing quark masses, is violated by the weak
interactions. Third, quarks interact with the leptons via the exchange of W and
Z vector mesons. Since the action couples two fermions and one vector meson
together, the exchange of a W or Z meson couples four fermionic fields together.
As in the previous chapter, one can therefore write down a phenomenological
action involving four fermions, similar to the original Fermi action. In the limit of
large vector meson mass, the Standard Model action can therefore be written as the
sum of four-fermion terms that link quarks and leptons together. The important
observation is that the charged W and neutral Z mesons couple to the currents, so
the effective four-fermion action is the sum of the products of two currents:


$$
\mathcal{L}_{\rm eff} = \frac{G_F}{\sqrt{2}} \sum_a J^{a\dagger}_\mu J^{a\mu} \tag{11.52}
$$


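where we sum over charged and neutral currents, and where the charged weak
current is the sum of a leptonic and a hadronic part:


$$
J^\mu = J^\mu_{\rm lept} + J^\mu_{\rm had} \tag{11.53}
$$

The leptonic part is given by:

$$
J^\mu_{\rm lept} = \bar{\psi}_e \gamma^\mu (1 - \gamma_5) \psi_{\nu_e} + \bar{\psi}_\mu \gamma^\mu (1 - \gamma_5) \psi_{\nu_\mu} \tag{11.54}
$$


and the hadronic part is given by a vector and axial-vector piece:

$$
J^\mu_{\rm had} = V^\mu - A^\mu \tag{11.55}
$$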


where the vector and axial vector currents are made out of quark fields. (There are 
corresponding expressions for the neutral current, which is mediated by Z meson 
exchange and is diagonal in the lepton fields.) If we neglect strangeness, then the 
charged hadronic weak current can be written as: 


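$$
\begin{aligned}
J^\mu_{\rm had} = \bar{d} \gamma^\mu (1 - \gamma_5) u
&= (\bar{u}\ \bar{d}\ \bar{s})\, \gamma^\mu (1 - \gamma_5)
\begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}
\begin{pmatrix} u \\ d \\ s \end{pmatrix} \\
&= \bar{q}\, \lambda^{1-i2} \gamma^\mu (1 - \gamma_5)\, q / 2
\end{aligned} \tag{11.56}
$$


In other words, it transforms as the 1 − i2 component of an SU(2) triplet.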

As before, the effective coupling constant $G_F$, because it is generated by the
exchange of W and Z mesons, is related to the vector meson coupling constant;
that is, $G_F/\sqrt{2} \sim g^2/M_W^2$.

Because the weak current is the sum of a leptonic and hadronic part, and
because the effective action is a product of two such currents, the effective weak
action can be broken up into three pieces, generating hadronic–hadronic,
hadronic–leptonic, and leptonic–leptonic interactions:


$$
\mathcal{L}_{\rm eff} = \frac{G_F}{\sqrt{2}} \sum \left\{ J_{\rm lept} + J_{\rm had} \right\}^\dagger_\mu \left\{ J_{\rm lept} + J_{\rm had} \right\}^\mu \tag{11.57}
$$

where we sum over both charged and neutral currents. The Standard Model effective
action can therefore describe, in principle, all possible interactions between
subatomic particles.

In the previous chapter, we discussed the leptonic—leptonic weak interactions. 
In this section, we will discuss the hadronic—hadronic weak interactions induced 
by the hadronic part of the weak current. Then in a later section, we will discuss 
the hadronic—leptonic interactions, which can mediate the weak decay of hadrons. 

The current algebra relations can also be easily deduced from the Standard
Model. The vector and axial vector currents generated by chiral SU(N) ⊗ SU(N)
symmetry of the quark model are:


$$
V^a_\mu = \bar{q} \gamma_\mu \lambda^a q / 2, \qquad A^a_\mu = \bar{q} \gamma_\mu \gamma_5 \lambda^a q / 2 \tag{11.58}
$$


where $\lambda^a$ are the generators of SU(N), and q transforms in the fundamental
representation of SU(N). For the free quark model, it is then easy to show that
these SU(N) ⊗ SU(N) currents generate a closed algebra, which forms the basis
of current algebra.

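To see this more explicitly, we note that our action is invariant under:


$$
\delta\phi_i = -i \epsilon^a (\tau^a)_{ij} \phi_j \tag{11.59}
$$

(for generality, $\phi_i$ can be either fermionic or bosonic).


From our previous discussion, we know that, for every symmetry of the action,
we have a conserved charge:


$$
Q^a(t) = \int J^a_0(x)\, d^3x \tag{11.60}
$$

where the current is:

$$
J^a_\mu = -i\, \frac{\delta \mathcal{L}}{\delta\, \partial^\mu \phi_i}\, (\tau^a)_{ij} \phi_j \tag{11.61}
$$


We know that the conjugate field is given by:


$$
\pi_i(x) = \frac{\delta \mathcal{L}}{\delta\, \partial^0 \phi_i} \tag{11.62}
$$


with canonical commutation (anticommutation) relations:


$$
[\pi_i(\mathbf{x}, t), \phi_j(\mathbf{y}, t)] = -i \delta^3(\mathbf{x} - \mathbf{y}) \delta_{ij} \tag{11.63}
$$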




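Assuming that the current is composed out of free fields, we can take the
commutator of two charges:


$$
\begin{aligned}
[Q^a(t), Q^b(t)] &= -\int d^3x\, d^3y\, \left[ \pi_i(\mathbf{x}, t) (\tau^a)_{ij} \phi_j(\mathbf{x}, t),\ \pi_k(\mathbf{y}, t) (\tau^b)_{kl} \phi_l(\mathbf{y}, t) \right] \\
&= -\int d^3x\, d^3y\, \Big\{ \pi_i(\mathbf{x}, t) (\tau^a)_{ij} \left[ \phi_j(\mathbf{x}, t), \pi_k(\mathbf{y}, t) \right] (\tau^b)_{kl} \phi_l(\mathbf{y}, t) \\
&\qquad\qquad + \pi_k(\mathbf{y}, t) (\tau^b)_{kl} \left[ \pi_i(\mathbf{x}, t), \phi_l(\mathbf{y}, t) \right] (\tau^a)_{ij} \phi_j(\mathbf{x}, t) \Big\} \\
&= -\int d^3x\, \left\{ \pi_i(\mathbf{x}, t)\, i [\tau^a, \tau^b]_{ij}\, \phi_j(\mathbf{x}, t) \right\}
\end{aligned} \tag{11.64}
$$

Finally, we find:

$$
[Q^a(t), Q^b(t)] = i f^{abc} Q^c(t) \tag{11.65}
$$

where the derivation works equally well for commutators with bosons and
anticommutators with fermions.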


Let us assume that we have, in addition to the usual symmetry, an axial
symmetry, generated by:


$$
Q^a_5(t) = \int J^{5a}_0(x)\, d^3x \tag{11.66}
$$


Then it is easy to show that the vector and axial vector currents generate the
algebra:


$$
\begin{aligned}
[Q^a(t), Q^b(t)] &= i f^{abc} Q^c(t) \\
[Q^a(t), Q^b_5(t)] &= i f^{abc} Q^c_5(t) \\
[Q^a_5(t), Q^b_5(t)] &= i f^{abc} Q^c(t)
\end{aligned} \tag{11.67}
$$


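For a unitary group, this generates the group SU(N) ⊗ SU(N). To see this, we
redefine:


$$
\begin{aligned}
Q^a_L &= \frac{1}{2} \left( Q^a - Q^a_5 \right) \\
Q^a_R &= \frac{1}{2} \left( Q^a + Q^a_5 \right)
\end{aligned} \tag{11.68}
$$


These generators have the commutation relations:


$$
\begin{aligned}
[Q^a_L(t), Q^b_L(t)] &= i f^{abc} Q^c_L(t) \\
[Q^a_R(t), Q^b_R(t)] &= i f^{abc} Q^c_R(t) \\
[Q^a_L(t), Q^b_R(t)] &= 0
\end{aligned} \tag{11.69}
$$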
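
The decoupling of the two factors can be verified in a few lines. The worked step below assumes the combinations $Q^a_{L,R} = \frac{1}{2}(Q^a \mp Q^a_5)$ (the assignment of signs to L and R is a convention and does not affect the result) and uses only the three commutators of the vector and axial charges, together with the antisymmetry of $f^{abc}$:

```latex
\begin{aligned}
[Q_L^a, Q_R^b]
  &= \tfrac{1}{4}\left( [Q^a, Q^b] + [Q^a, Q_5^b] - [Q_5^a, Q^b] - [Q_5^a, Q_5^b] \right) \\
  &= \tfrac{i}{4} f^{abc} \left( Q^c + Q_5^c - Q_5^c - Q^c \right) = 0 , \\[4pt]
[Q_L^a, Q_L^b]
  &= \tfrac{1}{4}\left( [Q^a, Q^b] - [Q^a, Q_5^b] - [Q_5^a, Q^b] + [Q_5^a, Q_5^b] \right) \\
  &= \tfrac{i}{4} f^{abc} \left( 2 Q^c - 2 Q_5^c \right) = i f^{abc} Q_L^c .
\end{aligned}
```

Here $[Q_5^a, Q^b] = -[Q^b, Q_5^a] = -i f^{bac} Q_5^c = i f^{abc} Q_5^c$. The computation for $Q_R$ is identical with the sign of $Q_5$ reversed.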
We can also show that the unintegrated currents satisfy:

$$
[Q^a(t), J^b_\mu(\mathbf{x}, t)] = i f^{abc} J^c_\mu(\mathbf{x}, t) \tag{11.70}
$$

as well as:

$$
[J^a_0(\mathbf{x}, t), J^b_0(\mathbf{y}, t)] = i f^{abc} J^c_0(\mathbf{x}, t)\, \delta^3(\mathbf{x} - \mathbf{y}) \tag{11.71}
$$


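For our purposes, we are interested in the current algebra that generates
SU(N) ⊗ SU(N) from Eq. (11.55):


$$
\begin{aligned}
[V^a_0(\mathbf{x}, t), V^b_0(\mathbf{y}, t)] &= i f_{abc} V^c_0(\mathbf{x}, t)\, \delta^3(\mathbf{x} - \mathbf{y}) \\
[V^a_0(\mathbf{x}, t), A^b_0(\mathbf{y}, t)] &= i f_{abc} A^c_0(\mathbf{x}, t)\, \delta^3(\mathbf{x} - \mathbf{y}) \\
[A^a_0(\mathbf{x}, t), A^b_0(\mathbf{y}, t)] &= i f_{abc} V^c_0(\mathbf{x}, t)\, \delta^3(\mathbf{x} - \mathbf{y})
\end{aligned} \tag{11.72}
$$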


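We should be careful to state, however, that the other
components of $J_\mu$ do not form such a simple algebra. For example, one can show
that the commutator between a current $J_0$ and $J_i$ does not close properly:


$$
[J^a_0(\mathbf{x}, t), J^b_i(\mathbf{y}, t)] = i f^{abc} J^c_i(\mathbf{x}, t)\, \delta^3(\mathbf{x} - \mathbf{y})
+ S^{ab}_{ij}\, \frac{\partial}{\partial x_j} \delta^3(\mathbf{x} - \mathbf{y}) \tag{11.73}
$$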


where the last term is called a Schwinger term. Very general arguments show that 
such a term must necessarily exist in the algebra. In our discussion of current 
algebra, we must be aware of the presence of these Schwinger terms. 


11.5 PCAC and the Adler-Weisberger Relation


By analyzing the properties of these weak currents, we can derive a large body of
relations between different physical processes, which agree remarkably well with
experiment. In this section, we will study the Conserved Vector Current (CVC)
hypothesis, the Partially Conserved Axial Current (PCAC) hypothesis, and the
Adler-Weisberger relation.

Although these relations were originally derived from the effective action and
current algebra, we can see that they are all rather simple consequences of the
Standard Model.


11.5.1 CVC


To see the origin of the CVC hypothesis, notice that the muon decay constant
$G_\mu$ is remarkably similar to the coefficient $C_V$ appearing in the strong
current of the old Fermi model for beta decay:

$$J_\mu = -\frac{1}{\sqrt{2}}\left(C_V\,\bar\psi_p\gamma_\mu\psi_n + C_A\,\bar\psi_p\gamma_\mu\gamma_5\psi_n\right) \tag{11.74}$$

In fact, the coupling constants for muon decay and neutron decay differ by only
2.2%:

$$\frac{G_\mu - G_n}{G_\mu} \approx 2.2\% \tag{11.75}$$


To explain this, we assume that the strong electromagnetic current, which
transforms like $J_\mu^3$ under SU(2), must be part of the same SU(2) multiplet as
the strangeness-preserving hadronic weak currents $J_\mu^{1+i2}$ and
$J_\mu^{1-i2}$. The CVC hypothesis simply says that $J_\mu^{1\pm i2}$ is conserved:

$$\text{CVC}: \quad \partial^\mu J_\mu^{1\pm i2} = 0 \tag{11.76}$$

just like the strong electromagnetic current, which can now be written as:

$$J_\mu^{\rm em} = V_\mu^3 + \frac{1}{\sqrt{3}}\,V_\mu^8 \tag{11.77}$$

Since the electromagnetic current and the hadronic weak current $J_\mu^{1\pm i2}$
now transform as part of the same SU(2) multiplet, there should be relations among
the couplings for this current. This simple observation has had experimental
success, for example, in explaining the beta decay rate of pions.

From the point of view of the quark model, CVC has a simple interpretation.
The CVC relations can be derived by writing down the quark representation of
the various currents:

$$
\begin{aligned}
J_\mu^{\rm em} &= \tfrac{2}{3}\,\bar u\gamma_\mu u - \tfrac{1}{3}\,\bar d\gamma_\mu d - \tfrac{1}{3}\,\bar s\gamma_\mu s \\
J_\mu^{1+i2} &= \bar u\gamma_\mu d + \cdots
\end{aligned}
\tag{11.78}
$$

(where the ellipses represent the strangeness-changing part of the hadronic weak
current, which we will discuss shortly). Written in terms of the quark fields,
it is obvious that the electromagnetic current and the strangeness-preserving
hadronic weak current are part of the same SU(2) multiplet. Since SU(2) is a
reasonably good symmetry of QCD, we expect CVC to hold (but to be broken by
electromagnetic interactions and the fact that $m_u \neq m_d$).


11.5.2 PCAC


Now let us assume, because of approximate chiral symmetry, that a modified
current conservation rule applies for the axial current $A_\mu^a$ as well. The
approximate SU(2) ⊗ SU(2) chiral symmetry of QCD allows us to write down new
relations based on the PCAC (partially conserved axial current) hypothesis. This
states that the axial current is exactly conserved in the limit of SU(2) ⊗ SU(2)
symmetry, but that its conservation is broken by the quark masses.
Phenomenologically, this means that the conservation of the axial current is
broken because of the small pion mass (the pion is now viewed as a
Nambu-Goldstone boson).

To see how PCAC provides nontrivial relations between scattering amplitudes,
let us construct the matrix element of the axial current $A_\mu^a$ between the
vacuum and a pion state $|\pi^b\rangle$. This matrix element can be coupled to the
matrix element of the leptonic weak current, so that it governs the decay of the
pion into an electron and neutrino. By Lorentz symmetry, this matrix element can
only be proportional to the momentum of the pion:

$$\langle 0|A_\mu^a(0)|\pi^b(p)\rangle = i f_\pi \delta^{ab} p_\mu \tag{11.79}$$

where $f_\pi$ is the pion decay constant, and experimentally it is equal to 93 MeV.
We normalize the pion state by:

$$\langle 0|\phi^a(0)|\pi^b(p)\rangle = \delta^{ab} \tag{11.80}$$

Now let us take the divergence of this equation:

$$\langle 0|\partial^\mu A_\mu^a(0)|\pi^b(p)\rangle = \delta^{ab} m_\pi^2 f_\pi = f_\pi m_\pi^2\,\langle 0|\phi^a(0)|\pi^b(p)\rangle \tag{11.81}$$

where we have integrated by parts. It is therefore reasonable to set, as an effective
relation:

$$\partial^\mu A_\mu^a = f_\pi m_\pi^2\,\phi^a \tag{11.82}$$
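
The step from Eq. (11.79) to Eq. (11.81) can be made explicit. The sketch below assumes the usual translation property of a local operator, so that the one-pion matrix element carries a plane-wave factor $e^{-ip\cdot x}$, and uses the on-shell condition $p^2 = m_\pi^2$:

```latex
\begin{aligned}
\langle 0|A_\mu^a(x)|\pi^b(p)\rangle &= i f_\pi \delta^{ab}\, p_\mu\, e^{-ip\cdot x}, \\
\langle 0|\partial^\mu A_\mu^a(x)|\pi^b(p)\rangle
  &= (-i p^\mu)\,(i f_\pi \delta^{ab} p_\mu)\, e^{-ip\cdot x}
   = f_\pi m_\pi^2\, \delta^{ab}\, e^{-ip\cdot x}.
\end{aligned}
```

Setting $x = 0$ reproduces the divergence relation, and the right-hand side vanishes as $m_\pi \to 0$, which is the sense in which the axial current is only partially conserved.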


which is the PCAC hypothesis. In the limit of vanishing pion mass, we have
exact chiral symmetry, and hence these equations give us an exactly conserved
axial vector current in this limit. (From the point of view of the quark model, one
can explicitly take the derivative of the axial current and one finds that it is not
conserved. The right-hand side of the equation is proportional to
$\bar q \lambda^a \gamma_5 q$, which has the quantum numbers of the $\pi$ meson.
When matrix elements of this relation are taken, one can, with a few assumptions,
show that the PCAC relation holds.)

One of the great successes of PCAC was the derivation of the Goldberger-Treiman
relation. The origin of this new relation was simple. The PCAC hypothesis links
the axial current to the pion field. By taking different matrix elements of the
PCAC relation, we can derive different relations between different physical
processes. For example, if we take the matrix element of the PCAC equation between
neutron and proton wave functions, then we can establish a relationship between
the pion-nucleon coupling constant $g_{\pi NN}$ and the decay constant of the pion
$f_\pi$. Thus, PCAC is able to link two seemingly unrelated physical processes,
pion-nucleon scattering and pion decay.

To see how this happens, let us take the matrix element of the axial current
between neutron and proton states. The only axial vectors that we have at our
disposal are $q_\mu\gamma_5$ and $\gamma_\mu\gamma_5$. Thus, the matrix element
must have the following form:

$$\langle p(k')|A_\mu^{1+i2}|n(k)\rangle = \bar u_p(k')\left[\gamma_\mu\gamma_5\, g_A(q^2) + q_\mu\gamma_5\, h_A(q^2)\right] u_n(k) \tag{11.83}$$

Now take the matrix element of the pion between the proton and neutron states:

$$\langle p(k')|\phi^{1+i2}|n(k)\rangle = \frac{2}{-q^2 + m_\pi^2}\, g_{\pi NN}(q^2)\, i\,\bar u_p(k')\gamma_5 u_n(k) \tag{11.84}$$

which is dominated by the pion pole term and where the coupling constant
$g_{\pi NN}(q^2)$ is related to the physical coupling constant for pion-nucleon
scattering by:

$$g_{\pi NN} = g_{\pi NN}(m_\pi^2), \qquad \frac{g_{\pi NN}^2}{4\pi} \approx 14.6 \tag{11.85}$$


Now let us put everything together. Let us take the divergence of the matrix
element of the PCAC equation. This easily gives us:

$$\frac{2 f_\pi m_\pi^2}{-q^2 + m_\pi^2}\, g_{\pi NN}(q^2) = 2M g_A(q^2) + q^2 h_A(q^2) \tag{11.86}$$

where $M = m_N$ is the nucleon mass. Let us now set $q^2 = 0$, and make the crucial
assumption that $g_{\pi NN}(q^2)$ does not vary much between $q^2 = 0$ and the pion
mass squared. (This assumption is based on the fact that the pion mass is small,
and that the analytic behavior of the function is relatively smooth in $q^2$
space.)

When we make this assumption, we find the celebrated Goldberger-Treiman
relation, which agrees well with experiment:

$$\frac{f_\pi\, g_{\pi NN}}{m_N} \approx g_A(0) \tag{11.87}$$

Experimentally, $g_A(0) \approx 1.22$, while $f_\pi g_{\pi NN}/m_N \approx 1.34$.
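
The quoted value of about 1.34 can be reproduced with a few lines of arithmetic. The sketch below uses the pion decay constant of 93 MeV and the coupling $g_{\pi NN}^2/4\pi \approx 14.6$ given in the text; the nucleon mass of roughly 939 MeV is an assumption supplied here for illustration and is not stated in this section.

```python
import math

# Inputs quoted in the text: f_pi = 93 MeV and g_piNN^2 / (4 pi) = 14.6.
F_PI_MEV = 93.0
G_PINN_SQ_OVER_4PI = 14.6
# Assumption (not from the text): an average nucleon mass of about 939 MeV.
M_NUCLEON_MEV = 939.0


def goldberger_treiman_ratio(f_pi, g_sq_over_4pi, m_nucleon):
    """Return f_pi * g_piNN / m_N, the left-hand side of the relation."""
    g_pinn = math.sqrt(4.0 * math.pi * g_sq_over_4pi)
    return f_pi * g_pinn / m_nucleon


ratio = goldberger_treiman_ratio(F_PI_MEV, G_PINN_SQ_OVER_4PI, M_NUCLEON_MEV)
print(f"f_pi * g_piNN / m_N = {ratio:.2f}")
```

With these inputs the ratio lies within about ten percent of the measured axial coupling, which is the level of agreement one expects from extrapolating $g_{\pi NN}(q^2)$ from the pion pole to $q^2 = 0$.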

This relation, we saw, depended crucially on the mass of the pion being small; 
that is, the masses of the u and d quarks are relatively small on a hadronic scale. 
However, the mass of the strange quark is larger than the others, and hence we 
expect the approximation to be less reliable for the K meson. 


11.5.3 Adler-Weisberger Relation


Finally, one of the most important relations that one can derive from the current
algebra is the celebrated Adler-Weisberger sum rule, which relates the integral
of pion-nucleon cross sections to known form factors.

Our goal is to write the pion-nucleon scattering amplitude in terms of the
scattering of nucleons and axial currents using current algebra and PCAC.
Therefore, we wish to study the scattering of a nucleon N and the axial current:

$$N(p_1) + A_\mu^a(q_1) \to N(p_2) + A_\nu^b(q_2) \tag{11.88}$$

where $p_1 + q_1 = p_2 + q_2$, and where we will eventually set $q_1 = q_2$. The
matrix element for this process is given by:

$$T_{\mu\nu}^{ab} = \int d^4x\, e^{iq\cdot x}\,\langle N(p_2)|T\,A_\mu^a(x) A_\nu^b(0)|N(p_1)\rangle \tag{11.89}$$


Let us contract this amplitude with $q^\mu q^\nu$. By integration by parts, we can
convert this into a divergence. Each time the derivative hits a $\theta$ function
within the time-ordered product, it creates a delta function. Then it is not hard
to show:

$$
\begin{aligned}
q^\mu q^\nu T_{\mu\nu}^{ab} = \int d^4x\, e^{iq\cdot x}\,\Big\{ &\langle N(p_2)|T\,\partial^\mu A_\mu^a(x)\,\partial^\nu A_\nu^b(0)|N(p_1)\rangle \\
&- i q^\mu\,\langle N(p_2)|\delta(x_0)[A_0^b(0), A_\mu^a(x)]|N(p_1)\rangle \\
&+ \langle N(p_2)|\delta(x_0)[A_0^a(x), \partial^\nu A_\nu^b(0)]|N(p_1)\rangle \Big\}
\end{aligned}
\tag{11.90}
$$

In general, almost nothing is known about the left-hand side of this equation.
However, the three terms on the right-hand side of this equation can, by PCAC
and current algebra, be reduced to known quantities.

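For example, the first term on the right-hand side of this equation can be related
to the pion-nucleon scattering amplitude via PCAC. Using the LSZ and PCAC
relations, we can rewrite the usual pion-nucleon amplitude as:

$$
\begin{aligned}
T_{\pi N}^{ab} &= i\int d^4x\, e^{iq\cdot x}\,(q_1^2 - m_\pi^2)(q_2^2 - m_\pi^2)\,\langle N(p_2)|T\,\phi^a(x)\phi^b(0)|N(p_1)\rangle \\
&= i\,(q_1^2 - m_\pi^2)(q_2^2 - m_\pi^2)\, m_\pi^{-4} f_\pi^{-2} \int d^4x\, e^{iq\cdot x}\,\langle N(p_2)|T[\partial^\mu A_\mu^a(x)\,\partial^\nu A_\nu^b(0)]|N(p_1)\rangle
\end{aligned}
\tag{11.91}
$$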


The second term on the right-hand side of Eq. (11.90) can also be reduced if
we use the following current algebra relation:

$$\delta(x_0)[A_0^b(0), A_\mu^a(x)] = -i\,\delta^4(x)\,\epsilon^{abc}\, V_\mu^c(x) \tag{11.92}$$

(There is a potential Schwinger term in this commutator, but one can show that it
cancels out.)

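Because the isovector current appears on the right-hand side, its matrix element
is proportional to $\tau^c$, so we can now reduce this expression down to:

$$
\begin{aligned}
-i q^\mu \int d^4x\, e^{iq\cdot x}\,\langle N(p)|\delta(x_0)[A_0^b(0), A_\mu^a(x)]|N(p)\rangle
&= -\epsilon^{abc}\, q^\mu\, \bar u(p)\gamma_\mu \tau^c u(p)/2 \\
&= -2\, p\cdot q\; \epsilon^{abc}\tau^c/2 = -\nu\,\epsilon^{abc}\tau^c
\end{aligned}
\tag{11.93}
$$

where $\nu = p\cdot q$.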

The third term on the right-hand side of Eq. (11.90), which we will call
$-i\sigma^{ab}$, can also be simplified. It is easy to show that this term is
symmetric in a and b. (To prove this, we drop the term with the integral over the
spatial derivative $\partial^i A_i^b$, since we assume that the fields vanish
sufficiently rapidly at infinity. Then we are left with a commutator between
$A_0^a$ and $\partial^0 A_0^b$. By integrating by parts, we can move $\partial^0$
to the other current. Then by reinserting the spatial derivatives, we find
that $\sigma^{ab} = \sigma^{ba}$.)

Putting everything together, we can write Eq. (11.90) as:

$$q^\mu q^\nu T_{\mu\nu}^{ab} = -i\,(q^2 - m_\pi^2)^{-2}\, m_\pi^4 f_\pi^2\, T_{\pi N}^{ab} + i\nu\,[\tau^a, \tau^b]/2 - i\sigma^{ab} \tag{11.94}$$

Next, we wish to reduce the left-hand side of this equation. In the low-energy
limit $q \to 0$, the term that dominates this expression consists of the one-nucleon
pole term, since the Feynman propagator of the one-nucleon pole term diverges.
Using Feynman's rules, we can add the two one-nucleon pole diagrams that
contribute to the pion-nucleon amplitude in this limit.

The poles coming from the two graphs are proportional to
$1/[(p \pm q)^2 - M^2] = 1/(\pm 2\nu + q^2)$, where $\nu = p\cdot q$. To calculate
the residue of these two pole terms, we use the fact that the matrix element of
$A_\mu^a$ between two nucleon states is given by Eq. (11.83).

Using the Gordon identity, we find that the sum of the two one-nucleon
exchange graphs becomes, to leading order:

$$q^\mu q^\nu T_{\mu\nu}^{ab} \approx i g_A^2\, \nu\, [\tau^a, \tau^b]/2 + \cdots \tag{11.95}$$

where $q^2 \ll \nu = p\cdot q$ for small q.

To simplify matters, we will be interested only in the amplitude that is
antisymmetric in a and b (so we can drop $\sigma^{ab}$). We will make the
following isotopic decomposition of $T^{ab}$:

$$T^{ab} = T^{+}\,\delta^{ab} + T^{-}\,\frac{[\tau^a, \tau^b]}{2} \tag{11.96}$$

We are only interested in $T^{-}$.
We can now put all the pieces together in Eq. (11.90). Taking the limit as
$q^2 \to 0$ in the expression for $T^{ab}(q, \nu)$, Eq. (11.90) becomes:

$$\lim_{\nu\to 0} \nu^{-1}\, T^{-}(\nu, 0) = (1 - g_A^2)/f_\pi^2 \tag{11.97}$$


This is our primary result. However, left in this fashion, this sum rule is rather
useless. In order to make comparisons with experimental data, we must rewrite
this in terms of measurable cross sections. Since $T^{-}$ is analytic and odd under
$\nu \to -\nu$, $\nu^{-1}T^{-}(\nu)$ satisfies an unsubtracted dispersion relation:

$$\frac{T^{-}(\nu, 0)}{\nu} = \frac{2}{\pi}\int_{\nu_0}^{\infty} d\nu'\,\frac{{\rm Im}\, T^{-}(\nu', 0)}{\nu'^2 - \nu^2} \tag{11.98}$$

Putting $\nu = 0$, we arrive at:

$$\frac{1}{g_A^2} = 1 + \frac{2 m_N^2}{\pi g_{\pi NN}^2}\int_{\nu_0}^{\infty} \frac{d\nu}{\nu^2}\,{\rm Im}\, T^{-}(\nu, 0) \tag{11.99}$$

where we have taken advantage of the Goldberger-Treiman relation.
Using the optical theorem, we can write:

$${\rm Im}\, T^{-}(\nu, m_\pi^2) = \nu\left[\sigma^{\pi^- p}(\nu) - \sigma^{\pi^+ p}(\nu)\right] \tag{11.100}$$

Inserting this back into the dispersion relation, we find the Adler-Weisberger
relation:

$$\frac{1}{g_A^2} = 1 + \frac{2 m_N^2}{\pi g_{\pi NN}^2}\int_{\nu_0}^{\infty} \frac{d\nu}{\nu}\left[\sigma^{\pi^- p}(\nu) - \sigma^{\pi^+ p}(\nu)\right] \tag{11.101}$$

Putting the experimental values into the relation, we can solve for $g_A$, which
yields 1.24, in good agreement with the experimental value of 1.259.
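
The passage from the low-energy theorem to the sum rule can be spelled out in two lines. This is a sketch that assumes the Goldberger-Treiman relation in the form $f_\pi = m_N g_A / g_{\pi NN}$ and evaluates the dispersion relation at $\nu = 0$:

```latex
\begin{aligned}
\frac{1 - g_A^2}{f_\pi^2}
  &= \lim_{\nu\to 0}\frac{T^{-}(\nu,0)}{\nu}
   = \frac{2}{\pi}\int_{\nu_0}^{\infty}\frac{d\nu'}{\nu'^2}\,\mathrm{Im}\,T^{-}(\nu',0), \\
\frac{1}{g_A^2} - 1
  &= \frac{f_\pi^2}{g_A^2}\,\frac{2}{\pi}\int_{\nu_0}^{\infty}\frac{d\nu'}{\nu'^2}\,\mathrm{Im}\,T^{-}(\nu',0)
   = \frac{2 m_N^2}{\pi g_{\pi NN}^2}\int_{\nu_0}^{\infty}\frac{d\nu'}{\nu'^2}\,\mathrm{Im}\,T^{-}(\nu',0).
\end{aligned}
```

Replacing $\mathrm{Im}\,T^{-}$ by $\nu'$ times the difference of the $\pi^- p$ and $\pi^+ p$ total cross sections then lowers the power of $\nu'$ in the denominator by one, which gives the cross-section form of the sum rule. Since $g_A > 1$, the integral must be negative, reflecting the larger $\pi^+ p$ cross section at low energies.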


11.6 Mixing Angle and Decay Processes 


In the previous section, we saw how the hadronic-hadronic part of the effective
Standard Model action in Eq. (11.57) gave us a wealth of weak interaction relations
that agreed well with experiment. In this section, we will examine the
hadronic-leptonic part of the effective Standard Model action. Specifically, we
will study the decays of hadrons via the weak interactions, which are mediated by
the hadronic-leptonic effective action. However, since there is a vast number of
decays, we will not catalog them. Although the Standard Model gives us the ability,
in principle, to calculate them all, we will only use the Standard Model to make
certain qualitative observations concerning these decays.

In the 1960s, many of these important decays were carefully studied
experimentally, although there was no comprehensive, underlying explanation for
their behavior. From the perspective of the Standard Model, many of the mysteries of
these decays can be easily explained. In particular, we will be interested in the
decays of the K mesons, which are bound states of the strange quark with the u or
d quarks. The addition of the strange quark to our discussion, however, brings in
an important complication: the quark mixing angles. Since the weak interactions
do not respect chiral SU(3) ⊗ SU(3) symmetry, there is no unique way in which
to insert the strange quark into the Weinberg-Salam model. In principle, since
the d and s quarks have the same charges, and since the weak interactions do not
respect global SU(3) symmetry, there is nothing to prevent the d and s quarks
from mixing within the same SU(2) doublet. We can parametrize this ambiguity
by taking the following SU(2) doublet:

$$\begin{pmatrix} u \\ d\cos\theta_C + s\sin\theta_C \end{pmatrix} \tag{11.102}$$

where $\theta_C$ is called the Cabibbo angle. The Standard Model does not explain
the origin of this mixing. The Cabibbo angle, however, allows us to parametrize
our ignorance. Like the quark masses, the Weinberg angle, etc., it is one of the
many undetermined parameters within the Standard Model (which indicates that
the Standard Model is only a first approximation to the correct theory of subatomic
particles). Experimentally, we find:

$$\sin\theta_C = 0.231 \pm 0.003 \tag{11.103}$$

so that the Cabibbo angle is $\theta_C \approx 13^\circ$.

In the Weinberg-Salam model, the W and Z mesons then couple to the strange
current, given by $\bar u\gamma_\mu(1 - \gamma_5)s$ times the sine of the Cabibbo
angle. If we write this in terms of its SU(3) content, this strange current
transforms as the 4 + i5 component. Thus, the vector and axial charged hadronic
currents can be written as:

$$
\begin{aligned}
V_\mu &= \cos\theta_C\, V_\mu^{1+i2} + \sin\theta_C\, V_\mu^{4+i5} \\
A_\mu &= \cos\theta_C\, A_\mu^{1+i2} + \sin\theta_C\, A_\mu^{4+i5}
\end{aligned}
\tag{11.104}
$$

If we write these charged currents in terms of their quark content, we find:

$$J_\mu = V_\mu - A_\mu = \cos\theta_C\left(\bar u\gamma_\mu d - \bar u\gamma_\mu\gamma_5 d\right) + \sin\theta_C\left(\bar u\gamma_\mu s - \bar u\gamma_\mu\gamma_5 s\right) \tag{11.105}$$


Since the Cabibbo angle is experimentally found to be relatively small,
$\sin\theta_C$ is suppressed relative to $\cos\theta_C$. The effective action,
written in terms of this quark representation, automatically reproduces the fact
that the 1 + i2 reactions are larger than the 4 + i5 reactions. This is because
the u-s quark current (transforming like the 4 + i5 current) is suppressed by a
factor of $\tan\theta_C$ relative to the u-d quark current (transforming like the
1 + i2 current).
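
To put numbers on this suppression, the short sketch below converts the measured $\sin\theta_C = 0.231$ into an angle and into the ratio of the two couplings. Treating $\tan^2\theta_C$ as a rate ratio is a naive estimate made here for illustration only: it ignores phase-space and form-factor differences between the strangeness-changing and strangeness-preserving decays.

```python
import math

SIN_THETA_C = 0.231  # central value quoted for the sine of the Cabibbo angle

theta_c = math.asin(SIN_THETA_C)
theta_c_deg = math.degrees(theta_c)

# Amplitude ratio of the u-s current to the u-d current.
tan_theta_c = math.tan(theta_c)
# Naive rate ratio (assumption: identical phase space and form factors).
tan_sq = tan_theta_c ** 2

print(f"theta_C      = {theta_c_deg:.1f} degrees")
print(f"tan(theta_C) = {tan_theta_c:.3f}")
print(f"tan^2        = {tan_sq:.3f}")
```

The angle comes out a little above 13 degrees, and the strangeness-changing amplitude is smaller than the strangeness-preserving one by roughly a factor of four.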

Now that we have parametrized the strange hadronic current, there is a wide 
variety of decay processes that can be described by the Standard Model. It will 
be helpful to divide these decays into several classes. 


11.6.1 Purely Leptonic Decays 


These decays involve no hadronic particles at all. For example, the decay of the
muon is a purely leptonic decay process.


11.6.2 Semileptonic Decays


Semileptonic decays are those that involve both hadrons and leptons. The decay of
hadrons into leptons is a typical example. These decays, in turn, are
divided further into two types, $\Delta S = 0$ and $\Delta S \neq 0$.

For $\Delta S = 0$, the beta decay of the neutron is one of the most important
examples. Other $\Delta S = 0$ semileptonic decays include the following hyperon and
pion decays:

$$
\begin{aligned}
\Sigma^{\pm} &\to \Lambda + e^{\pm} + \nu \\
\pi^{\pm} &\to \pi^0 + e^{\pm} + \nu
\end{aligned}
\tag{11.106}
$$

Strangeness-changing decays include: $\Lambda \to p + e + \nu$.
Some other examples of semileptonic decays are given by $\pi_{l3}$ and $K_{l3}$ decays:

$$
\begin{aligned}
\pi^{\pm} &\to \pi^0 + l^{\pm} + \nu \\
K^{\pm} &\to \pi^0 + l^{\pm} + \nu
\end{aligned}
\tag{11.107}
$$

where $l = e, \mu$.

In an obvious notation, the decays of the K meson are sometimes designated
$K_{e3}$, $K_{\mu 3}$, $K_{e4}$, and $K_{\mu 4}$.


11.6.3 Nonleptonic Decays 


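These decays involve the decay of hadrons into other hadrons. These decays can
also be divided into two classes, $\Delta S = 0$ or $\Delta S \neq 0$.
Some of the hyperon nonleptonic decays are:

$$
\begin{aligned}
\Sigma^{+} &\to p + \pi^0 \\
\Omega^{-} &\to \Lambda + K^{-}
\end{aligned}
\tag{11.108}
$$

Nonleptonic K decays include:

$$
\begin{aligned}
K^0 &\to \pi^{+} + \pi^{-} \\
K^{+} &\to \pi^{+} + \pi^0 + \pi^0
\end{aligned}
\tag{11.109}
$$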
Although the Standard Model gives us the ability to calculate these decays
from Feynman's rules, we will only make a few brief qualitative observations
concerning these decays, from the point of view of the Standard Model.

First, the Standard Model forbids a large number of decays that cannot be
described by the current-current effective action. This is totally consistent with
the experimental data. Second, the various rules that have been accumulated over
the years (such as the $\Delta I = \frac{1}{2}$ rule) can be explained qualitatively
by analyzing the isospin nature of the currents of the Standard Model. Third, we
find that certain decays are suppressed relative to others because the Cabibbo
angle is small.

As an example, consider the decays:

$$
\begin{aligned}
\pi^{+} &\to \pi^0 + e^{+} + \nu \\
K^{+} &\to \pi^0 + e^{+} + \nu
\end{aligned}
\tag{11.110}
$$

The first transition from $\pi^{+}$ to $\pi^0$ is mediated by a current transforming
as 1 + i2, while the transition from $K^{+}$ to $\pi^0$ is mediated by a 4 + i5
current. Thus, we expect the coupling constants for these decays to be related to
each other via the Cabibbo angle, which agrees with experiment.


11.7  GIM Mechanism and Kobayashi-Maskawa Matrix 


There is very strong experimental evidence that strangeness-changing neutral
currents are suppressed. For example, we have the experimental results:


['(K? > wtp) 


105: 
T(K? —all) 
T(K+ = xv) a 
ae eee 0.6 x 10 1.111 
T(K= — all) x ere? 


The numerator is sensitive to the existence of a weak current that couples to 
the strange quark s, is electrically neutral, and changes the strangeness number. 
The fact that these processes are extremely rare indicates that such currents should 
be absent in our action, at least to lowest order. 

One of the triumphs of this simple picture is the success of the GIM
(Glashow-Iliopoulos-Maiani) mechanism, which uses a fourth, charmed quark to cancel
such currents at the tree level.

So far, the model that we have been describing allows strangeness-changing neutral
currents, which arise when we analyze the following part of the neutral current:
$J_\mu^0 = \bar d'\gamma_\mu(1 - \gamma_5)d'$. If we expand out this current, we
find the piece:

$$\bar s\gamma_\mu(1 - \gamma_5)d\; \sin\theta_C\cos\theta_C \tag{11.112}$$

which introduces strangeness-changing processes through the s-d coupling. We
want to cancel this term.

To explain the absence of such a current, we will use the fourth charmed quark,
which will give us a global SU(4) flavor symmetry. The hadronic weak current
can now be represented in terms of four quarks as:

$$J_\mu^a = \bar q\gamma_\mu(1 - \gamma_5)\tau^a q/2 \tag{11.113}$$


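where q consists of two SU(2) doublets:

$$\begin{pmatrix} u \\ d' \end{pmatrix}; \qquad \begin{pmatrix} c \\ s' \end{pmatrix} \tag{11.114}$$

where:

$$
\begin{aligned}
d' &= d\cos\theta_C + s\sin\theta_C \\
s' &= -d\sin\theta_C + s\cos\theta_C
\end{aligned}
\tag{11.115}
$$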


The key observation is that the existence of the charmed fourth quark allows
us to express the neutral current, which is diagonal in the fermion fields, as the
sum of two terms:

$$\bar d'\gamma_\mu(1 - \gamma_5)d' + \bar s'\gamma_\mu(1 - \gamma_5)s' \tag{11.116}$$

Because of the presence of $s'$ in the neutral current, there is an additional piece
to the strangeness-changing neutral current given by:

$$-\bar s\gamma_\mu(1 - \gamma_5)d\; \sin\theta_C\cos\theta_C \tag{11.117}$$

If we add the two pieces in Eqs. (11.117) and (11.112) together, we find
an exact cancellation, meaning that (at the tree level) there are no
strangeness-changing neutral currents. The mixing of the various quarks therefore
gives us new physically interesting results.

(Another way of saying this is that the total neutral current has the quark
content $\bar d'd' + \bar s's' = \bar dd + \bar ss$ if we drop the Dirac matrices.
This combination is invariant even after rotating the quarks by the Cabibbo angle
$\theta_C$. The neutral, strangeness-changing current vanishes because there is no
term proportional to $\bar sd$ in this combination.)
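
The cancellation can be checked numerically by tracking only the flavor coefficients of the neutral current. The sketch below is an illustration under simplifying assumptions: the Dirac structure is dropped, each rotated field is represented by its coefficients in the (d, s) basis, and the helper names `cabibbo_rows` and `neutral_current_matrix` are hypothetical.

```python
import math


def cabibbo_rows(theta):
    """Coefficients of d' and s' in the (d, s) basis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]


def neutral_current_matrix(rows):
    """Coefficient matrix n[i][j] of (bar q_i ... q_j) summed over the given fields."""
    n = [[0.0, 0.0], [0.0, 0.0]]
    for row in rows:
        for i in range(2):
            for j in range(2):
                n[i][j] += row[i] * row[j]
    return n


theta_c = math.asin(0.231)
rows = cabibbo_rows(theta_c)

# Three-quark model: only d' enters the neutral current.
without_charm = neutral_current_matrix([rows[0]])
# Four-quark model: both d' and s' enter.
with_charm = neutral_current_matrix(rows)

print("s-d coefficient without charm:", round(without_charm[1][0], 4))
print("s-d coefficient with charm:   ", round(with_charm[1][0], 4))
```

Without the charmed doublet the off-diagonal entry equals $\sin\theta_C\cos\theta_C$; with it, the coefficient matrix reduces to the identity for any value of the angle, which is the statement that the rotation is orthogonal.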

We can now appreciate the importance of the angle $\theta_C$, which not only serves
to suppress some reactions that occur with $\sin\theta_C$, but also eliminates
strangeness-changing neutral currents via charm. Given the importance of mixing
between generations, let us now try to analyze the question of mixing systematically
between three generations of quarks and leptons. The Cabibbo angle, for example,
was the unique way in which to mix two generations of quark flavors. It is possible,
with a few simple arguments, to write down the complete set of these mixing angles
for three generations.

To construct the most general charged weak current, we first notice that the
u, c, t all have charge $\frac{2}{3}$, while d, s, b all have charge $-\frac{1}{3}$.
Since the weak interactions do not respect global flavor symmetry, there is nothing
to prevent mixing within these groups of quarks of the same charge. We thus have the
freedom to rearrange the charge $\frac{2}{3}$ and the charge $-\frac{1}{3}$ quarks
into two multiplets called U and D, respectively:

$$U = \begin{pmatrix} u \\ c \\ t \end{pmatrix}; \qquad D = \begin{pmatrix} d \\ s \\ b \end{pmatrix} \tag{11.118}$$


where the space in which we are working is labeled by the generation or family.
Since we have three families, we can mix the three families of quarks within U
and D. The most arbitrary mixing between them can be parametrized by:

$$U' = M_U U; \qquad D' = M_D D \tag{11.119}$$

where $M_U$ and $M_D$ are 3 × 3 unitary matrices.
Then the charged weak current can be written as:

$$
\begin{aligned}
J_\mu &= \bar U'\gamma_\mu(1 - \gamma_5)D' \\
&= \bar U\gamma_\mu(1 - \gamma_5)M D
\end{aligned}
\tag{11.120}
$$

where we have defined the Kobayashi-Maskawa matrix as:

$$M = M_U^\dagger M_D \tag{11.121}$$


The unitary matrix M is an $N_f \times N_f$ matrix for $N_f$ families. This matrix
has, in general, $N_f^2$ real parameters. However, since there are $2N_f$ quarks,
$2N_f - 1$ of these parameters can be absorbed into the quark wave functions. We are
then left with $(N_f - 1)^2$ real parameters (mixing angles and phases) that cannot
be absorbed by any field redefinition.
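
The counting argument can be tabulated for a few values of the number of families. The sketch below follows the subtraction described above; the further split into rotation angles and complex phases uses the standard fact that a real orthogonal $N \times N$ matrix has $N(N-1)/2$ parameters, which is an input added here rather than something stated in the text. The function name `km_parameter_count` is hypothetical.

```python
def km_parameter_count(n_families):
    """Return (physical, angles, phases) for an N x N quark mixing matrix."""
    total = n_families ** 2                 # real parameters of a unitary matrix
    removable = 2 * n_families - 1          # relative phases of the quark fields
    physical = total - removable            # equals (N - 1)^2
    angles = n_families * (n_families - 1) // 2   # parameters of a real rotation
    phases = physical - angles
    return physical, angles, phases


for n in (2, 3, 4):
    physical, angles, phases = km_parameter_count(n)
    print(f"N = {n}: {physical} parameters = {angles} angles + {phases} phases")
```

For two families the single surviving parameter is a rotation angle, so no complex phase is available; the first phase appears with three families.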

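For $N_f = 2$, we have one mixing angle, which is just the original Cabibbo
angle. In that case, the charged weak current is:

$$J_\mu = (\bar u \;\; \bar c)\,\gamma_\mu(1 - \gamma_5)\begin{pmatrix} \cos\theta_C & \sin\theta_C \\ -\sin\theta_C & \cos\theta_C \end{pmatrix}\begin{pmatrix} d \\ s \end{pmatrix} \tag{11.122}$$

However, for $N_f = 3$, we have the possibility of four mixing parameters.
Traditionally, these are parametrized with three angles $\theta_i$, i = 1, 2, 3,
and one phase $\delta$. Then the matrix M is usually written as:

$$
M = \begin{pmatrix} 1 & 0 & 0 \\ 0 & C_2 & S_2 \\ 0 & -S_2 & C_2 \end{pmatrix}
\times \begin{pmatrix} C_1 & S_1 & 0 \\ -S_1 & C_1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\times \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & e^{i\delta} \end{pmatrix}
\times \begin{pmatrix} 1 & 0 & 0 \\ 0 & C_3 & S_3 \\ 0 & -S_3 & C_3 \end{pmatrix}
\tag{11.123}
$$

where $C_i = \cos\theta_i$ and $S_i = \sin\theta_i$.
Written out explicitly, this is:

$$
M = \begin{pmatrix}
C_1 & S_1 C_3 & S_1 S_3 \\
-S_1 C_2 & C_1 C_2 C_3 - S_2 S_3 e^{i\delta} & C_1 C_2 S_3 + S_2 C_3 e^{i\delta} \\
S_1 S_2 & -C_1 S_2 C_3 - C_2 S_3 e^{i\delta} & -C_1 S_2 S_3 + C_2 C_3 e^{i\delta}
\end{pmatrix}
\tag{11.124}
$$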
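
The product of the four factors can be checked against the explicit form with a small numerical experiment. The sketch below multiplies a rotation in the 2-3 plane, a rotation in the 1-2 plane, a diagonal phase matrix, and a second rotation in the 2-3 plane, then compares the result entry by entry with the written-out matrix and tests unitarity. The angle values are arbitrary illustrative numbers, not measured ones, and the helper names are hypothetical.

```python
import cmath
import math


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def km_from_factors(t1, t2, t3, delta):
    c1, s1 = math.cos(t1), math.sin(t1)
    c2, s2 = math.cos(t2), math.sin(t2)
    c3, s3 = math.cos(t3), math.sin(t3)
    e = cmath.exp(1j * delta)
    r2 = [[1, 0, 0], [0, c2, s2], [0, -s2, c2]]
    r1 = [[c1, s1, 0], [-s1, c1, 0], [0, 0, 1]]
    phase = [[1, 0, 0], [0, 1, 0], [0, 0, e]]
    r3 = [[1, 0, 0], [0, c3, s3], [0, -s3, c3]]
    return matmul(matmul(matmul(r2, r1), phase), r3)


def km_explicit(t1, t2, t3, delta):
    c1, s1 = math.cos(t1), math.sin(t1)
    c2, s2 = math.cos(t2), math.sin(t2)
    c3, s3 = math.cos(t3), math.sin(t3)
    e = cmath.exp(1j * delta)
    return [
        [c1, s1 * c3, s1 * s3],
        [-s1 * c2, c1 * c2 * c3 - s2 * s3 * e, c1 * c2 * s3 + s2 * c3 * e],
        [s1 * s2, -c1 * s2 * c3 - c2 * s3 * e, -c1 * s2 * s3 + c2 * c3 * e],
    ]


def max_abs_difference(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))


def unitarity_defect(m):
    """Largest deviation of M M^dagger from the identity."""
    dagger = [[complex(m[j][i]).conjugate() for j in range(3)] for i in range(3)]
    prod = matmul(m, dagger)
    return max(abs(prod[i][j] - (1 if i == j else 0))
               for i in range(3) for j in range(3))


# Arbitrary illustrative parameters (radians); not experimental values.
params = (0.23, 0.05, 0.02, 1.0)
product = km_from_factors(*params)
explicit = km_explicit(*params)

print("max difference:  ", max_abs_difference(product, explicit))
print("unitarity defect:", unitarity_defect(explicit))
```

Both printed numbers should be at the level of floating-point rounding. Setting the second and third angles to zero leaves only the 1-2 rotation, which is the two-family Cabibbo matrix embedded in the upper-left block.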


Experimentally, the three mixing angles $\theta_i$ are either smaller than or
comparable to the Cabibbo angle. In the limit $\theta_2 = \theta_3 = 0$, $\theta_1$
reduces to the Cabibbo angle. Also, one of the advantages of the KM formalism is
that it gives us a convenient way of parametrizing CP violation, which is found
experimentally in K meson decays. The angle $\delta$ gives us a complex M matrix,
thereby violating CP invariance. (CP invariance demands that $M^* = M$.)

In summary, the Standard Model is created by splicing QCD with the Weinberg-Salam
model. It allows us to unify all known experimental data concerning particle
interactions via the gauge group SU(3) ⊗ SU(2) ⊗ U(1). The gauge fields of color
SU(3) are responsible for binding the quarks together, while the gauge fields of
SU(2) ⊗ U(1) mediate the electromagnetic and weak interactions. Altogether,
there are quite a few free parameters in the theory: three coupling constants for
the groups in SU(3) ⊗ SU(2) ⊗ U(1), two parameters in the Higgs sector (the
Higgs mass and the Higgs vacuum expectation value v), $N_f^2 + 1$ quark parameters
[$2N_f$ quark masses for $N_f$ families and $(N_f - 1)^2$ KM mixing angles and
phases], an equal number $N_f^2 + 1$ of lepton parameters (for massive neutrinos),
and the angle $\theta_{\rm QCD}$ (coming from instanton contributions). For a
Standard Model with massive neutrinos, we thus have $2(N_f^2 + 1) + 6$ free
parameters. For three families or generations, that makes 26 free parameters. (For
massless neutrinos and no leptonic mixing angles, we have 19 free parameters.) With
so many free parameters, the Standard Model should be viewed as the first
approximation to the true theory of subatomic particles.

In the next chapter, we will discuss quantum anomalies that arise in any naive
attempt to quantize chiral fermions. The marriage between quarks and leptons in
the Standard Model is not a trivial one, because the anomalies of the leptons in the
Weinberg-Salam model cancel precisely the anomalies coming from the quarks.
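
Returning to the parameter count in the summary above, the totals of 26 and 19 can be reproduced by adding up the individual contributions. The sketch below is a bookkeeping aid only; it assumes that in the massless-neutrino case the lepton sector contributes just the charged-lepton masses, one per family, and the function name is hypothetical.

```python
def standard_model_parameter_count(n_families, massive_neutrinos=True):
    """Count free parameters following the itemization in the summary."""
    gauge_couplings = 3          # SU(3), SU(2), U(1)
    higgs_sector = 2             # Higgs mass and vacuum expectation value
    theta_qcd = 1
    # 2N masses plus (N - 1)^2 mixing angles and phases = N^2 + 1.
    quark_sector = 2 * n_families + (n_families - 1) ** 2
    if massive_neutrinos:
        lepton_sector = 2 * n_families + (n_families - 1) ** 2
    else:
        # Assumption: only the charged-lepton masses remain.
        lepton_sector = n_families
    return gauge_couplings + higgs_sector + theta_qcd + quark_sector + lepton_sector


print(standard_model_parameter_count(3, massive_neutrinos=True))
print(standard_model_parameter_count(3, massive_neutrinos=False))
```

The first call corresponds to three families with massive neutrinos and leptonic mixing, the second to massless neutrinos without leptonic mixing.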


11.8 Exercises 


1. Calculate the tensor product reduction of 3 ⊗ 8 and 6 ⊗ 6 for SU(3) using
Young tableaux. Identify the dimension of each of the Young tableaux in the
decomposition.


2. For SU(4), calculate the decomposition of:

$$15 \otimes 15 \otimes 15 \tag{11.125}$$


3. If we adopt the SU(3) particle assignments in Eqs. (11.9) and (11.10), show 
that the meson and baryon matrices are given by Eqs. (11.12) and (11.13). 


4. Using Feynman’s rules, prove Eq. (11.30). 


5. Why must a Schwinger term exist in Eq. (11.73)? Hint: assume that the
Schwinger term is missing, and then prove:

$$
\begin{aligned}
0 &= \langle 0|[J_0(x,t), \partial_0 J_0(y,t)]|0\rangle \\
&= i\sum_n \left(e^{i p_n\cdot(x-y)} + e^{-i p_n\cdot(x-y)}\right) E_n\, |\langle 0|J_0(0)|n\rangle|^2
\end{aligned}
\tag{11.126}
$$

Then prove that this relation shows that $J_0 = 0$, and so, by contradiction, there
must be a Schwinger term.


6. Let the electromagnetic current $J_{\rm em}^\mu$ be sandwiched between two proton
states. By invariance arguments, the most general matrix element is given by:

$$\langle p', s'|J_{\rm em}^\mu(x)|p, s\rangle = e^{iq\cdot x}\,\bar u(p', s')\left(F_1(q^2)\gamma^\mu + i\,\frac{F_2(q^2)}{2M}\,\sigma^{\mu\nu}q_\nu + F_3(q^2)\,q^\mu\right)u(p, s) \tag{11.127}$$

and $q_\mu$ is the difference in momenta. Using the fact that the electromagnetic
current is conserved, prove that $F_3(q^2) = 0$. Using time-reversal invariance,
prove that $F_1$ and $F_2$ are both real.


7. Write down an explicit matrix representation of the generators of SU(5) and 
SU(6) using the algorithm mentioned in the book. 


8. Can the bottom quark and top quark be added and still have no flavor changing 
neutral currents? Examine the KM matrix. 


9. Prove Eq. (11.90). 


10. In the Standard Model with two generations, we could have mixed the u and 
c quarks together as well as the d and s quarks. This would give us two 
Cabibbo angles. How do we reconcile this with a single Cabibbo angle? 


11. Since the experimental evidence points to three quark colors, why cannot 
the color gauge group be SO(3) instead of SU(3)? SO(3) would appar- 
ently satisfy many of the experimental tests. (Hint: analyze if the triplet 
representations are real or complex for the antiquarks.) 


12. Show explicitly how the $N_f \times N_f$ KM matrix, with $N_f^2$ real
parameters, can be reduced to a matrix with only $(N_f - 1)^2$ unknowns after a
redefinition of the quark wave functions.


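13. Let the matrix element of the strangeness-preserving vector hadronic weak
current between neutron and proton states be:

$$\langle p|J_\mu^{1+i2}|n\rangle = \bar u_p\left(f_1(q^2)\gamma_\mu + i\,\frac{f_2(q^2)}{2M}\,\sigma_{\mu\nu}q^\nu + f_3(q^2)\,q_\mu\right)u_n \tag{11.128}$$

Using the CVC hypothesis, prove:

$$
\begin{aligned}
f_1(q^2) &= F_1^p(q^2) - F_1^n(q^2) \to 1 \quad (\text{as } q^2 \to 0) \\
f_2(q^2) &= F_2^p(q^2) - F_2^n(q^2) \to \mu_p - \mu_n \quad (\text{as } q^2 \to 0) \\
f_3(q^2) &= 0
\end{aligned}
\tag{11.129}
$$

where $F_i$ are defined in Problem 6.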


14. For the massive quark model, prove to lowest order that the divergence of the
axial current can be written as:

$$\partial^\mu A_\mu^a = 2im\,\bar q\tau^a\gamma_5 q \tag{11.130}$$

(where we integrate by parts). What assumptions are necessary to convert
this into the PCAC relationship?


15. Prove that the strangeness-preserving charged hadronic weak current
$J_\mu^{1+i2}$ and its conjugate are responsible for inducing the following weak
reactions:

$$
\begin{aligned}
\pi^{+} &\to \text{vacuum state} \\
\pi^{+} &\to \pi^0 \\
n &\to p \\
\Sigma^{-} &\to \Lambda
\end{aligned}
\tag{11.131}
$$

where we omit the effect of the leptonic weak current.


16. Similarly, prove that the strangeness-changing charged hadronic weak current
$J_\mu^{4+i5}$ and its conjugate are responsible for inducing the following
reactions:

$$
\begin{aligned}
K^{+} &\to \text{vacuum state} \\
K^{+} &\to \pi^0 \\
\Sigma^{-} &\to n \\
\Lambda &\to p
\end{aligned}
\tag{11.132}
$$

where we omit the effect of the leptonic weak current.


17. Show that the covariant derivatives for the quarks in Eqs. (11.50) and (11.51) 
yield the correct charge assignments of the quarks after symmetry breaking. 


Chapter 12 


Ward Identities, BRST, 
and Anomalies 


12.1 Ward-Takahashi Identity


In the case of QED, we found that the Ward-Takahashi identities were a powerful
way in which to prove important relations between renormalization constants.
In this chapter, we will examine these identities from the path integral point of
view, and show how they can be generalized to gauge theories with very little extra
effort. We also explore perhaps the most convenient way in which to summarize
the information contained within the Ward-Takahashi identities, which is the
BRST approach. And finally, we will show that these Ward-Takahashi identities
actually break down in certain circumstances due to anomalies.

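The origin of the WT identities lies in the gauge invariance of the generating
functional of QED, which is given by:

$$Z(J_\mu, \eta, \bar\eta) = \int DA_\mu\, D\psi\, D\bar\psi\; \exp\, i\int d^4x\left[\mathcal{L}(\psi, A_\mu) + A_\mu J^\mu + \bar\eta\psi + \bar\psi\eta\right] \tag{12.1}$$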


If we make a field redefinition of $\psi$ and $A_\mu$ (i.e., make an arbitrary,
field-dependent redefinition of the fields), we know that the generating functional
$Z(J_\mu, \eta, \bar\eta)$ remains the same. This is because changing variables in
an integral never affects its value.

We can consider a gauge transformation to be a specific type of field
redefinition. Thus, the generating functional is trivially invariant under a gauge
transformation. This will allow us to derive nontrivial identities on the generating
functional.

We know that the action and the measure are both gauge invariant. The generating
functional is also invariant, so the only terms that are not gauge invariant are
the gauge-fixing term and the coupling to the sources. The contribution of these
non-gauge-invariant terms, we see, must vanish because of the overall invariance of
the generating functional under a field redefinition.

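We begin with the gauge-fixed action:

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \bar\psi(i\gamma^\mu D_\mu - m)\psi - \frac{1}{2\alpha}(\partial\cdot A)^2 \tag{12.2}$$

and the following gauge transformations on the fields:

$$
\begin{aligned}
\delta A_\mu &= \partial_\mu\Lambda \\
\delta\psi &= -ie\Lambda\psi \\
\delta\bar\psi &= ie\Lambda\bar\psi
\end{aligned}
\tag{12.3}
$$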

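The variation of the generating functional is given as:

$$
\begin{aligned}
Z + \delta Z &= \int DA_\mu\, D\psi\, D\bar\psi\; \exp\, i\int d^4x\Big[\mathcal{L} + A_\mu J^\mu + \bar\eta\psi + \bar\psi\eta \\
&\qquad\qquad + \partial_\mu\Lambda\, J^\mu - ie\Lambda(\bar\eta\psi - \bar\psi\eta) - \frac{1}{\alpha}(\partial\cdot A)\,\Box\Lambda\Big] \\
&= \int DA_\mu\, D\psi\, D\bar\psi\,\left\{1 + i\int d^4x\, \Lambda\left(-\partial_\mu J^\mu - ie(\bar\eta\psi - \bar\psi\eta) - \frac{1}{\alpha}\Box\,\partial\cdot A\right)\right\} \\
&\qquad\qquad \times \exp\, i\int d^4x\left[\mathcal{L} + A_\mu J^\mu + \bar\eta\psi + \bar\psi\eta\right]
\end{aligned}
\tag{12.4}
$$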
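
The three pieces of the variation can be checked term by term. The sketch below takes the gauge-fixing term $-\frac{1}{2\alpha}(\partial\cdot A)^2$ and the transformations $\delta A_\mu = \partial_\mu\Lambda$, $\delta\psi = -ie\Lambda\psi$, $\delta\bar\psi = ie\Lambda\bar\psi$, and assumes that $\Lambda$ falls off at infinity so that surface terms from integrating by parts can be dropped (the arrows denote equality under $\int d^4x$):

```latex
\begin{aligned}
\delta\!\left[-\frac{1}{2\alpha}(\partial\cdot A)^2\right]
  &= -\frac{1}{\alpha}(\partial\cdot A)\,\Box\Lambda
  \;\longrightarrow\; -\frac{1}{\alpha}\,\Lambda\,\Box(\partial\cdot A), \\
\delta\!\left[A_\mu J^\mu\right]
  &= (\partial_\mu\Lambda)\,J^\mu
  \;\longrightarrow\; -\Lambda\,\partial_\mu J^\mu, \\
\delta\!\left[\bar\eta\psi + \bar\psi\eta\right]
  &= -ie\Lambda\left(\bar\eta\psi - \bar\psi\eta\right).
\end{aligned}
```

Because $\Lambda(x)$ is arbitrary, the vanishing of $\delta Z$ forces the sum of these three coefficients of $\Lambda$ to vanish inside the functional integral.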


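To lowest order in $\Lambda$, this can be written as the functional version of the
WT identity:

$$\left[\frac{i}{\alpha}\,\Box\,\partial^\mu\frac{\delta}{\delta J^\mu} - \partial_\mu J^\mu - e\left(\bar\eta\frac{\delta}{\delta\bar\eta} - \eta\frac{\delta}{\delta\eta}\right)\right] Z(J_\mu, \eta, \bar\eta) = 0 \tag{12.5}$$

where we have made the substitutions:

$$A_\mu \to -i\frac{\delta}{\delta J^\mu}; \qquad \psi \to -i\frac{\delta}{\delta\bar\eta}; \qquad \bar\psi \to i\frac{\delta}{\delta\eta} \tag{12.6}$$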

We now have established the nontrivial WT constraint on the generating functional. 
This identity, however, is not yet written in a form recognizable from our previous discussion of the WT identities. To derive the first consequence of the identity, let us differentiate the expression with respect to $J_\nu(y)$ and then set $J = \eta = \bar\eta = 0$. The terms that survive are:


$$\frac{i}{\alpha}\,\partial^2\partial^\mu \frac{\delta^2 Z}{\delta J^\mu(x)\,\delta J_\nu(y)} = \partial^\nu\delta^4(x-y) \tag{12.7}$$


Now let us take the Fourier transform of the expression. In momentum space, we have:


$$-\frac{1}{\alpha}\,k^2 k^\mu \Delta_{\mu\nu}(k) = k_\nu \tag{12.8}$$

where $\Delta_{\mu\nu}$ is the connected part of the interacting photon propagator. The general solution to this equation is easy to find in terms of a longitudinal and a transverse part:

$$\Delta_{\mu\nu}(k) = -\alpha\,\frac{k_\mu k_\nu}{k^4} + \left( g_{\mu\nu} - \frac{k_\mu k_\nu}{k^2} \right) f(k^2) \tag{12.9}$$

for some function $f(k^2)$. This means that the longitudinal part is unchanged by higher-order corrections, as expected.
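
As a quick numerical sanity check on Eq. (12.9), the sketch below contracts $k^\mu$ into $\Delta_{\mu\nu}$ and confirms that the combination $-(1/\alpha)k^2 k^\mu\Delta_{\mu\nu}$ equals $k_\nu$ regardless of the transverse function $f(k^2)$. The metric signature $(+,-,-,-)$, the test momentum, and all function names are illustrative assumptions, not taken from the text.

```python
# Metric signature (+,-,-,-): an assumption; the text does not fix it here.
METRIC = [1.0, -1.0, -1.0, -1.0]


def lower(k_up):
    """Lower a Lorentz index: k_mu = g_{mu nu} k^nu."""
    return [METRIC[i] * k_up[i] for i in range(4)]


def propagator(k_up, alpha, f):
    """Delta_{mu nu}(k) of Eq. (12.9), with f standing in for f(k^2)."""
    k_dn = lower(k_up)
    k2 = sum(k_up[i] * k_dn[i] for i in range(4))
    delta = [[0.0] * 4 for _ in range(4)]
    for m in range(4):
        for n in range(4):
            g_mn = METRIC[m] if m == n else 0.0
            longitudinal = -alpha * k_dn[m] * k_dn[n] / k2**2
            transverse = (g_mn - k_dn[m] * k_dn[n] / k2) * f
            delta[m][n] = longitudinal + transverse
    return delta, k2


def contracted(k_up, alpha, f):
    """Return -(1/alpha) k^2 k^mu Delta_{mu nu} as a list over nu."""
    delta, k2 = propagator(k_up, alpha, f)
    return [-(k2 / alpha) * sum(k_up[m] * delta[m][n] for m in range(4))
            for n in range(4)]


k_up = [1.3, 0.2, -0.7, 0.4]          # arbitrary off-shell test momentum
free_like = contracted(k_up, alpha=0.8, f=0.0)
dressed = contracted(k_up, alpha=0.8, f=5.0)
print(free_like, dressed, lower(k_up))
```

The transverse projector is annihilated by $k^\mu$, so changing `f` leaves the contraction untouched; only the gauge-parameter term survives.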

However, to make contact with the Ward–Takahashi identity found earlier in QED, it is necessary to convert the constraint into one for the proper vertices. We recall first that $Z = e^{iW}$, so we can write the identity as a constraint on $W$:


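$$\left[ \frac{1}{\alpha}\,\partial^2\partial^\mu\frac{\delta}{\delta J^\mu} + ie\left( \bar\eta\frac{\delta}{\delta\bar\eta} - \eta\frac{\delta}{\delta\eta} \right) \right] W(J_\mu, \eta, \bar\eta) + \partial_\mu J^\mu = 0 \tag{12.10}$$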


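The next step is to convert the identity to a constraint on $\Gamma$, the generator of proper vertices. As usual, we now make a Legendre transformation on the fields:


$$\Gamma(A_\mu, \psi, \bar\psi) = W(J_\mu, \eta, \bar\eta) - \int d^4x\, \left( A_\mu J^\mu + \bar\psi\eta + \bar\eta\psi \right) \tag{12.11}$$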


Then we can make the following substitutions: 


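$$\begin{aligned}
\frac{\delta W}{\delta J^\mu} &= A_\mu; & \frac{\delta\Gamma}{\delta A_\mu} &= -J^\mu \\
\frac{\delta W}{\delta\eta} &= -\bar\psi; & \frac{\delta\Gamma}{\delta\psi} &= \bar\eta \\
\frac{\delta W}{\delta\bar\eta} &= \psi; & \frac{\delta\Gamma}{\delta\bar\psi} &= -\eta
\end{aligned} \tag{12.12}$$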
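Making these substitutions, we then have the formula for the proper vertices:


$$\partial^\mu\frac{\delta\Gamma}{\delta A^\mu} - ie\left( \psi\frac{\delta\Gamma}{\delta\psi} - \bar\psi\frac{\delta\Gamma}{\delta\bar\psi} \right) - \alpha^{-1}\partial^2\partial^\mu A_\mu = 0 \tag{12.13}$$
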
This is the form for the Ward—Takahashi identity that yields the various identities 
found in QED. Notice that by repeated differentiation of this identity, we can 
derive more and more complicated versions of the WT identity. 

As an example, let us take the simplest case, where we differentiate the above equation (evaluated at the point $z$) by $\bar\psi(x)$ and $\psi(y)$ and set $\bar\psi = \psi = A_\mu = 0$ at the end of the calculation.
After differentiation, we find: 


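$$-\partial^\mu_z \frac{\delta^3\Gamma(0,0,0)}{\delta\bar\psi(x)\,\delta\psi(y)\,\delta A^\mu(z)} = ie\,\delta^4(x-z)\frac{\delta^2\Gamma(0,0,0)}{\delta\bar\psi(x)\,\delta\psi(y)} - ie\,\delta^4(y-z)\frac{\delta^2\Gamma(0,0,0)}{\delta\bar\psi(x)\,\delta\psi(y)} \tag{12.14}$$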


where $\Gamma(0, 0, 0)$ equals $\Gamma(A_\mu, \psi, \bar\psi)$ where all fields have been set to zero. The Fourier transform of $\Gamma$, in turn, can be related to both the electron propagator $S'_F$ as well as the proper vertex function $\Gamma_\mu$, via:


$$\int d^4x\, d^4y\, e^{i(p'\cdot x - p\cdot y)}\, \frac{\delta^2\Gamma(0,0,0)}{\delta\bar\psi(x)\,\delta\psi(y)} = (2\pi)^4\,\delta^4(p'-p)\, iS'_F(p)^{-1} \tag{12.15}$$


(where the prime indicates that we are taking the interacting electron propagator) 
and: 


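$$\int d^4x\, d^4y\, d^4z\, e^{i(p'\cdot x - p\cdot y - q\cdot z)}\, \frac{\delta^3\Gamma(0,0,0)}{\delta\bar\psi(x)\,\delta\psi(y)\,\delta A^\mu(z)} = ie(2\pi)^4\,\delta^4(p'-p-q)\,\Gamma_\mu(p, q, p') \tag{12.16}$$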


Taking the Fourier transform of the Ward—Takahashi identity, we have: 


$$q^\mu\,\Gamma_\mu(p, q, p+q) = S'^{-1}_F(p+q) - S'^{-1}_F(p) \tag{12.17}$$


which is the more familiar form of the identity found in Chapter 7. 
If we take the limit as $q_\mu \to 0$, we find:


$$\Gamma_\mu(p, 0, p) = \frac{\partial S'^{-1}_F(p)}{\partial p^\mu} \tag{12.18}$$

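
As a consistency check, the identity can be verified at lowest order. The following is a sketch assuming the free (tree-level) expressions $S_F^{-1}(p) = \not p - m$ and $\Gamma_\mu = \gamma_\mu$, which are standard but are not derived in this section:

```latex
\begin{align*}
q^\mu \Gamma_\mu &= q^\mu\gamma_\mu = \not q \\
S_F^{-1}(p+q) - S_F^{-1}(p) &= (\not p + \not q - m) - (\not p - m) = \not q \\
\frac{\partial S_F^{-1}(p)}{\partial p^\mu} &= \frac{\partial}{\partial p^\mu}\left(\gamma_\nu p^\nu - m\right) = \gamma_\mu
\end{align*}
```

Both the finite-momentum form of the identity and its $q \to 0$ limit hold trivially at this order; the content of the identity is that the same relations persist once radiative corrections are included.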
12.2 Slavnov–Taylor Identities


As in the case of ordinary QED, we may derive a set of identities on the generating 
functionals of gauge theory. However, the corresponding identities, called the 
Slavnov–Taylor identities, are much more involved. Later, we will use what
is called the BRST formalism in order to simplify the complications found in the 
Slavnov—Taylor identities. 

The Slavnov—Taylor identities are complicated because of two factors: the 
nonlinear nature of the gauge transformation, and also the presence of the Faddeev— 
Popov ghosts. 

The generating functional (for just the gauge field) can be written as: 


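$$Z(J_\mu) = N\int DA_\mu\, \Delta_{FP}\, \exp\, i\int d^4x \left( -\frac{1}{4}F^a_{\mu\nu}F^{a\mu\nu} - \frac{1}{2\alpha}\left(F^a\right)^2 + J^a_\mu A^{a\mu} \right) \tag{12.19}$$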
As we saw earlier, $\Delta_{FP}$ can be written as the determinant:


$$\Delta_{FP} = \det\,(M)_{x,y;\,a,b} \tag{12.20}$$


where the $M$ matrix is defined in terms of the gauge-fixing function $F^a(A_\mu)$:


$$\delta F^a(A_\mu(x)) = \int d^4y\, M^{ab}(x, y)\,\Lambda^b(y) \tag{12.21}$$


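As before, we know that $Z(J^a_\mu)$ is invariant under a field redefinition. Since a gauge transformation is also a field redefinition, it means that the generating functional is gauge invariant. Thus, $\delta Z = 0$ under this transformation. This means:


$$\begin{aligned}
0 = \delta Z(J^a_\mu) &= \int DA_\mu\, \Delta_{FP}\, \exp\, i\int d^4x\, \left[ \mathcal{L}(A) + J^a_\mu A^{a\mu} \right] \\
&\quad \times \int d^4x \left( -\frac{1}{\alpha}F^a(x)\int d^4y\, M^{ab}(x,y)\,\Lambda^b(y) + J^a_\mu D^{\mu,ab}\Lambda^b \right)
\end{aligned} \tag{12.22}$$
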


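For small $\Lambda$, we can bring the last term in the exponential into the integrand. Finally, we can choose $\Lambda \to M^{-1}\Lambda$. With this substitution, we now have the new identity:


$$\left[ \frac{1}{\alpha}F^a\!\left(\frac{\delta}{i\,\delta J}\right) - \int d^4y\, J^b_\mu(y)\, D^{\mu,bc}\!\left(\frac{\delta}{i\,\delta J}\right) \left(M^{-1}\right)^{ca}\!\left(y, x; \frac{\delta}{i\,\delta J}\right) \right] Z = 0 \tag{12.23}$$
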
This is a rather complicated nonlinear identity, and in general it is quite difficult
to extract simple identities on the proper vertices. Also, this identity is rather 
awkward to work with from the point of view of renormalization theory. 

There are, however, some clever tricks that one can use in rendering this 
problem tractable. The key to this simpler construction is to use the BRST 
construction. 


12.3 BRST Quantization


After fixing a gauge, we know that a gauge theory is no longer gauge invariant. 
All the local gauge invariances have been removed from the theory. Thus, it is 
rather surprising that, even after gauge fixing has been performed, a new symmetry 
arises involving the Faddeev—Popov ghosts. This new symmetry, however, is a 
global one, and hence no new degrees of freedom can be eliminated from the 
gauge-fixed theory. 

We recall that the gauge-fixed action is given by: 


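$$\mathcal{L} = -\frac{1}{4}\left(F^a_{\mu\nu}\right)^2 - \frac{1}{2\alpha}\left(\partial^\mu A^a_\mu\right)^2 - \bar\eta^a\,\partial^\mu D_\mu\eta^a \tag{12.24}$$

The original action, of course, was invariant under:

$$\delta A^a_\mu = \frac{1}{g}\,\partial_\mu\Lambda^a + f^{abc}A^b_\mu\Lambda^c \tag{12.25}$$

Now make the replacement:

$$\Lambda^a = -\eta^a\lambda \tag{12.26}$$


where $\eta^a$ and $\lambda$ are both Grassmann variables and $\lambda$ is constant. Then the gauge-fixed action is invariant under a new global symmetry: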


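$$\begin{aligned}
\delta A^a_\mu &= -\frac{1}{g}\left(D_\mu\eta^a\right)\lambda \\
\delta\eta^a &= -\frac{1}{2}f^{abc}\eta^b\eta^c\lambda \\
\delta\bar\eta^a &= -\frac{1}{\alpha g}\left(\partial^\mu A^a_\mu\right)\lambda
\end{aligned} \tag{12.27}$$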
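To prove the invariance of the action, we note that:

$$\delta\mathcal{L}_{GF} = \frac{1}{\alpha g}\left(\partial^\mu A^a_\mu\right)\left(\partial^\nu D_\nu\eta^a\right)\lambda \tag{12.28}$$

and:

$$\delta\mathcal{L}_{FP} = -\left(\delta\bar\eta^a\right)\partial^\mu D_\mu\eta^a - \bar\eta^a\,\partial^\mu\,\delta\left(D_\mu\eta^a\right) \tag{12.29}$$

Adding these together, the first term of Eq. (12.29) cancels Eq. (12.28) once $\lambda$ is anticommuted to the right, and we find:

$$\delta\mathcal{L} = -\bar\eta^a\,\partial^\mu\left(\delta D_\mu\eta^a\right) \tag{12.30}$$

We can also prove that:

$$\delta\left(D_\mu\eta^a\right) = \delta\left(f^{abc}\eta^b\eta^c\right) = 0 \tag{12.31}$$

where we have used the Jacobi identity on the structure constants:

$$f^{abk}f^{kde} + f^{adk}f^{keb} + f^{aek}f^{kbd} = 0 \tag{12.32}$$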


Thus, the action is BRST invariant. 
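
The Jacobi identity (12.32) is easy to verify by brute force for a specific group. A minimal sketch, assuming SU(2), where the structure constants are $f^{abc} = \epsilon^{abc}$ (the choice of group and the helper names are illustrative assumptions):

```python
from itertools import product


def f(a, b, c):
    """SU(2) structure constants: the Levi-Civita symbol on indices 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2


def jacobi(a, b, d, e):
    """Left-hand side of Eq. (12.32) for fixed external indices a, b, d, e."""
    return sum(
        f(a, b, k) * f(k, d, e) + f(a, d, k) * f(k, e, b) + f(a, e, k) * f(k, b, d)
        for k in range(3)
    )


# Collect every index assignment for which the identity fails (there should be none).
violations = [idx for idx in product(range(3), repeat=4) if jacobi(*idx) != 0]
print("Jacobi violations:", violations)
```

The same loop works for any group once `f` is replaced by that group's structure constants and the index range is adjusted.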
Since the original variation was nilpotent, one can also show that: 


$$\delta^2_{\rm BRST} = 0 \tag{12.33}$$


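Using Noether's method, we can also construct the current that corresponds to the BRST variation. Up to an overall normalization, the Noether prescription gives:


$$J_\mu = \sum_i \frac{\delta\mathcal{L}}{\delta\,\partial^\mu\phi_i}\,\frac{\delta_{\rm BRST}\,\phi_i}{\delta\lambda} = -F^a_{\mu\nu}D^\nu\eta^a - \frac{1}{\alpha}\left(\partial^\nu A^a_\nu\right)D_\mu\eta^a + \frac{g}{2}f^{abc}\left(\partial_\mu\bar\eta^a\right)\eta^b\eta^c \tag{12.34}$$
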
From the Noether current, we can also construct the BRST charge: 
$$Q_{\rm BRST} = \int d^3x\, J_0 \tag{12.35}$$


which satisfies the nilpotency condition: 


$$Q^2_{\rm BRST} = 0 \tag{12.36}$$
In general, the states of the theory are constructed from all possible monomials
that one can construct out of $\eta^a$ and $\bar\eta^a$. Thus, the Fock space has increased
enormously with the presence of these ghosts and antighosts. However, one can 
show that the physical state condition (similar to the Gupta—Bleuler condition) is 
given by:


$$Q_{\rm BRST}|\Psi\rangle = 0 \tag{12.37}$$


The states that satisfy this condition are the physical states of the system. We now 
have a compact and elegant statement of the physical state condition. 


12.4 Anomalies 


Because of the subtle manipulations that must be performed on potentially divergent quantities when we renormalize a theory, there may be unexpected surprises. One of these is the existence of Adler–Bell–Jackiw (ABJ) anomalies.

An anomaly is the failure of a classical symmetry to survive the process of 
quantization and regularization. For example, in a chiral gauge theory, we naively 
expect axial currents to be conserved. However, we will find that actions that are 
classically chiral symmetric can develop anomalies that spoil the conservation of 
the axial current. 

If we start with a gauge theory that naively is invariant under axial gauge 
symmetry: 


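$$\psi \to e^{i\gamma_5\theta}\psi \tag{12.38}$$

then we can define:

$$\begin{aligned}
V_\mu(x) &= \bar\psi(x)\gamma_\mu\psi(x) \\
A_\mu(x) &= \bar\psi(x)\gamma_\mu\gamma_5\psi(x) \\
P(x) &= \bar\psi(x)\gamma_5\psi(x)
\end{aligned} \tag{12.39}$$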


Using the naive equations of motion, we can easily show: 


$$\begin{aligned}
\partial^\mu V_\mu &= 0 \\
\partial^\mu A_\mu &= 2imP(x)
\end{aligned} \tag{12.40}$$


The last equation vanishes in the limit of zero mass, that is, when chiral 
symmetry is restored. It appears as if we have an exact conservation of both the 
vector current and the axial current in the zero mass limit. However, we will see
that this current conservation is anomalous, that the divergence of the axial current 
is not equal to zero, even in the zero mass limit. 

Specifically, we will examine the “triangle graph,” which consists of an internal 
fermion loop connected to two vector fields and to one axial vector field. This is 
appropriately called the V-V-A triangle graph. 

If we perform power counting on this graph, we find that the integration 
over $d^4k$ gives us four powers of momentum in the numerator, but the fermionic
propagators only give us three powers of momentum in the denominator. Thus, 
the graph should diverge linearly. 

The origin of this anomaly is rather subtle. In performing the integration 
over the loop variable, we will cancel certain graphs by performing a shift of the 
integration variable. Normally, one expects that integrals like this vanish: 


$$\int_{-\infty}^{\infty} dx\, \left[ f(x+a) - f(x) \right] = 0 \tag{12.41}$$


because, by shifting $x + a \to x$, we get an exact cancellation. However, we have
tacitly made certain unjustified assumptions. 
To see how this integral may not vanish, let us power expand it: 


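$$\begin{aligned}
\int_{-\infty}^{\infty} dx \left( a f'(x) + \frac{a^2}{2}f''(x) + \cdots \right) = a\left[ f(\infty) - f(-\infty) \right] + \frac{a^2}{2}\left[ f'(\infty) - f'(-\infty) \right] + \cdots
\end{aligned} \tag{12.42}$$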


If the integral of $f$ converges, then there is no problem in setting the above equal to zero. However, if the integral diverges linearly, then Eq. (12.41) need not vanish. In fact, it can equal $a\left[f(\infty) - f(-\infty)\right]$.
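
The surface term can be seen numerically. The sketch below compares a rapidly decaying function with one that tends to different constants at $\pm\infty$; the test functions (a Gaussian and $\tanh x$), the shift, and the integration window are illustrative choices, not taken from the text.

```python
import math


def shifted_difference_integral(f, a, half_width, steps):
    """Trapezoidal estimate of the integral of f(x + a) - f(x) over [-L, L]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * (f(x + a) - f(x))
    return total * h


a = 0.3

# f falls off at infinity: the shift is harmless and the integral vanishes.
convergent = shifted_difference_integral(lambda x: math.exp(-x * x), a, 30.0, 60000)

# f -> +1 and -1 at the two ends: the integral equals a * [f(inf) - f(-inf)] = 2a.
surface = shifted_difference_integral(math.tanh, a, 30.0, 60000)

print(convergent, surface, 2 * a)
```

For $\tanh x$ the integral of $f$ itself grows linearly with the cutoff, which is the one-dimensional analogue of the linear divergence of the triangle graph.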

We can also generalize this ambiguity to arbitrary (Euclidean) dimensions. 
Let us define the function: 


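$$\begin{aligned}
\Delta(a) &= \int d^Nx\, \left[ f(x+a) - f(x) \right] \\
&= \int d^Nx\, \left[ a^\mu\partial_\mu f(x) + \frac{1}{2}\left(a^\mu\partial_\mu\right)^2 f(x) + \cdots \right] \\
&= a^\mu\,\frac{R_\mu}{R}\, f(R)\, S_N(R)
\end{aligned} \tag{12.43}$$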


Figure 12.1. The V-V-A and V-V-P triangle graphs, which give rise to the anomaly.

In performing the integral over the volume element $d^Nx$, we used Gauss's theorem to drop all but the first term in the expansion. In the last line, the volume integral reduces to a surface integral over a large hypersphere with radius $R$ and surface area $S_N(R)$, whose points are labeled by the vector $R_\mu$. For the case of four dimensions, we can perform the integral, and take the limit as the hypersphere's radius $R$ expands to infinity. The result is:


$$\Delta(a) = \lim_{R\to\infty} \left(2i\pi^2\right) a^\mu R_\mu R^2 f(R) \tag{12.44}$$
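
The numerical factor in this result can be traced as follows. This is a sketch, assuming the standard surface area of a three-sphere in four Euclidean dimensions and that the factor of $i$ arises from rotating back to Minkowski space; neither step is spelled out in the text.

```latex
\begin{align*}
S_4(R) &= 2\pi^2 R^3 \\
a^\mu \frac{R_\mu}{R}\, f(R)\, S_4(R) &= 2\pi^2\, a^\mu R_\mu R^2 f(R) \\
d^4x = i\, d^4x_E \quad &\Longrightarrow \quad
\Delta(a) = \lim_{R\to\infty} 2i\pi^2\, a^\mu R_\mu R^2 f(R)
\end{align*}
```

For a function falling off like $R^{-3}$, which is the behavior of a linearly divergent integrand, the limit is finite and nonzero.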


Now that we see the inherent ambiguity in shifting the integration variable in a 
linearly divergent integral, let us apply this knowledge to gauge theory. To begin, 
let us examine the following two matrix elements, corresponding to the V-V-A
and V-V-P triangle (Fig. 12.1): 


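$$\begin{aligned}
T_{\mu\nu\rho}(k_1, k_2, q) &= i\int d^4x_1\, d^4x_2\, \langle 0|T\left[ V_\mu(x_1)V_\nu(x_2)A_\rho(0) \right]|0\rangle\, e^{ik_1\cdot x_1 + ik_2\cdot x_2} \\
T_{\mu\nu}(k_1, k_2, q) &= i\int d^4x_1\, d^4x_2\, \langle 0|T\left[ V_\mu(x_1)V_\nu(x_2)P(0) \right]|0\rangle\, e^{ik_1\cdot x_1 + ik_2\cdot x_2}
\end{aligned} \tag{12.45}$$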
Our next step is to differentiate the previous expressions. This will pull down a factor of $q^\rho$, but there are complications when we take the derivative of a time-ordered product, which contains theta functions in the time variable. In particular, we have:


$$\partial_\mu\,\theta(x^0 - y^0) = \delta_{\mu 0}\,\delta(x^0 - y^0) \tag{12.46}$$
This means that:

$$\partial^\mu_x\, T\left( V_\mu(x)O(y) \right) = T\left( \partial^\mu V_\mu(x)\,O(y) \right) + \left[ V_0(x), O(y) \right]\delta(x^0 - y^0) \tag{12.47}$$


for an arbitrary operator O(y). 
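
The extra commutator term can be seen by writing the time-ordered product out explicitly. The following is a short sketch of the intermediate step, for bosonic operators:

```latex
\begin{align*}
T\left(V_\mu(x)O(y)\right) &= \theta(x^0-y^0)\,V_\mu(x)O(y) + \theta(y^0-x^0)\,O(y)V_\mu(x) \\
\partial^\mu_x\, T\left(V_\mu(x)O(y)\right) &= T\left(\partial^\mu V_\mu(x)\,O(y)\right)
 + \delta(x^0-y^0)\,V_0(x)O(y) - \delta(x^0-y^0)\,O(y)V_0(x) \\
&= T\left(\partial^\mu V_\mu(x)\,O(y)\right) + \left[V_0(x), O(y)\right]\delta(x^0-y^0)
\end{align*}
```

Only the time component of the current appears in the commutator, because the theta functions depend on $x^0$ alone.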
Taking different derivatives of the matrix element, we easily derive: 


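$$\begin{aligned}
k_1^\mu\, T_{\mu\nu\rho} &= 0 \\
k_2^\nu\, T_{\mu\nu\rho} &= 0 \\
q^\rho\, T_{\mu\nu\rho} &= 2m\,T_{\mu\nu}
\end{aligned} \tag{12.48}$$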


Written out explicitly using Feynman rules, we find that the matrix elements can 
be written as: 


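$$\begin{aligned}
T_{\mu\nu\rho}(a) = &-i\int\frac{d^4p}{(2\pi)^4}\, {\rm Tr}\left[ \frac{i}{\not p + \not a - m}\,\gamma_\rho\gamma_5\, \frac{i}{(\not p + \not a - \not q) - m}\,\gamma_\nu\, \frac{i}{(\not p + \not a - \not k_1) - m}\,\gamma_\mu \right] \\
&+ \left\{ k_1 \leftrightarrow k_2,\ \mu \leftrightarrow \nu \right\}
\end{aligned} \tag{12.49}$$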


It is important to notice that we have explicitly made a shift $p \to p + a$ in performing the integral, where $a = \alpha k_1 + (\alpha - \beta)k_2$, and where $\alpha$ and $\beta$ are arbitrary. Normally, for convergent integrals, $T_{\mu\nu\rho}(a)$ is independent of $a$, as one sees by shifting the region of integration. However, we now see that $T_{\mu\nu\rho}(a)$ is linearly divergent, and hence inherently ambiguous. We can, of course, explicitly calculate the value of $T_{\mu\nu\rho}(a) - T_{\mu\nu\rho}(0)$ using the formula derived earlier.

Dropping the cross term for the moment, we find: 


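$$\begin{aligned}
T_{\mu\nu\rho}(a) - T_{\mu\nu\rho}(0)
&= -\int\frac{d^4p}{(2\pi)^4}\, a^\sigma\frac{\partial}{\partial p^\sigma}\, {\rm Tr}\left[ \frac{1}{\not p - m}\,\gamma_\rho\gamma_5\, \frac{1}{\not p - \not q - m}\,\gamma_\nu\, \frac{1}{\not p - \not k_1 - m}\,\gamma_\mu \right] + \cdots \\
&= -i\,\frac{2\pi^2 a^\sigma}{(2\pi)^4}\, \lim_{p\to\infty} p^2 p_\sigma\, {\rm Tr}\left( \gamma_\alpha\gamma_\rho\gamma_5\gamma_\beta\gamma_\nu\gamma_\delta\gamma_\mu \right) \frac{p^\alpha p^\beta p^\delta}{p^6} + \cdots \\
&= -\frac{2i\pi^2 a_\sigma}{(2\pi)^4}\, \lim_{p\to\infty} \frac{p^\sigma p^\alpha}{p^2}\, 4i\,\epsilon_{\alpha\mu\nu\rho} + \left\{ k_1 \leftrightarrow k_2,\ \mu \leftrightarrow \nu \right\} \\
&= \frac{\epsilon_{\sigma\mu\nu\rho}\,a^\sigma}{8\pi^2} + \left\{ k_1 \leftrightarrow k_2,\ \mu \leftrightarrow \nu \right\}
\end{aligned} \tag{12.50}$$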
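
The last steps of Eq. (12.50) rest on the fact that the trace of $\gamma_5$ with four gamma matrices is proportional to the totally antisymmetric tensor. A minimal numerical sketch, assuming the Dirac representation with upper indices and $\gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3$ (both are assumptions here; the overall sign of the result depends on these choices and on the sign convention adopted for $\epsilon$):

```python
from itertools import permutations


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from 2x2 blocks [[a, b], [c, d]]."""
    return [a[r] + b[r] for r in range(2)] + [c[r] + d[r] for r in range(2)]


def scale(s, m):
    """Multiply every entry of the matrix m by the scalar s."""
    return [[s * x for x in row] for row in m]


ZERO = [[0, 0], [0, 0]]
ONE = [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# Dirac representation with upper indices (an assumed, standard choice).
GAMMA = [block(ONE, ZERO, ZERO, scale(-1, ONE))]
GAMMA += [block(ZERO, s, scale(-1, s), ZERO) for s in SIGMA]

# gamma_5 = i gamma^0 gamma^1 gamma^2 gamma^3
GAMMA5 = scale(1j, matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3])))


def gamma5_trace(mu, nu, rho, sigma):
    """Tr(gamma_5 gamma^mu gamma^nu gamma^rho gamma^sigma)."""
    prod = GAMMA5
    for index in (mu, nu, rho, sigma):
        prod = matmul(prod, GAMMA[index])
    return sum(prod[i][i] for i in range(4))


def parity(perm):
    """+1 for even permutations of (0, 1, 2, 3), -1 for odd ones."""
    inversions = sum(1 for i in range(4) for j in range(i + 1, 4)
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


# With these conventions the trace is -4i times the permutation sign.
antisymmetric = all(
    abs(gamma5_trace(*p) - (-4j) * parity(p)) < 1e-12
    for p in permutations(range(4))
)
print(gamma5_trace(0, 1, 2, 3), antisymmetric)
```

Lowering all four indices flips the sign of the $0123$ component, which is why the factor appears as $4i\,\epsilon$ with lower indices in conventions where $\epsilon_{0123} = +1$.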
This then gives us:

$$T_{\mu\nu\rho}(a) - T_{\mu\nu\rho}(0) = \frac{\beta}{8\pi^2}\,\epsilon_{\sigma\mu\nu\rho}\left( k_1 - k_2 \right)^\sigma \tag{12.51}$$


where $\beta$ is a constant that is not yet determined and is inherently ambiguous.
Finally, the anomalous Ward—Takahashi identity can be written as: 


$$q^\rho\, T_{\mu\nu\rho} = 2m\,T_{\mu\nu} - \frac{1-\beta}{4\pi^2}\,\epsilon_{\mu\nu\rho\sigma}k_1^\rho k_2^\sigma \tag{12.52}$$


This equation, which expresses the divergence of the axial current, implies that 
axial current conservation is anomalous. 

At this point, the value of $\beta$ is arbitrary. $\beta$, in turn, can be fixed by the requirement that we preserve vector current conservation. Thus, we demand that $k_1^\mu T_{\mu\nu\rho} = k_2^\nu T_{\mu\nu\rho} = 0$, even though they, too, are anomalous. Demanding that the vector current be exactly conserved serves to fix the ambiguity in $\beta$.

To fix the value of $\beta$, we now calculate the anomaly coming from the vector current and then set it equal to zero.

We must thus calculate: 


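$$\begin{aligned}
k_1^\mu\, T_{\mu\nu\rho} = &-i\int\frac{d^4p}{(2\pi)^4}\, {\rm Tr}\left[ \frac{i}{\not p - m}\,\gamma_\rho\gamma_5\, \frac{i}{(\not p - \not q) - m}\,\gamma_\nu\, \frac{i}{(\not p - \not k_1) - m}\,\not k_1 \right] \\
&-i\int\frac{d^4p}{(2\pi)^4}\, {\rm Tr}\left[ \frac{i}{\not p - m}\,\gamma_\rho\gamma_5\, \frac{i}{(\not p - \not q) - m}\,\not k_1\, \frac{i}{(\not p - \not k_2) - m}\,\gamma_\nu \right]
\end{aligned} \tag{12.53}$$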


Using the identity: 


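$$\begin{aligned}
\not k_1 &= \left( \not p - m \right) - \left[ (\not p - \not k_1) - m \right] \\
&= \left[ (\not p - \not k_2) - m \right] - \left[ (\not p - \not q) - m \right]
\end{aligned} \tag{12.54}$$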


we can write the expression as the difference between two shifted integrands, 
which in turn allows us to write everything in terms of a limit on a hypersphere: 


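$$\begin{aligned}
k_1^\mu\, T_{\mu\nu\rho}
&= -\int\frac{d^4p}{(2\pi)^4}\, {\rm Tr}\left( \gamma_\rho\gamma_5\, \frac{1}{\not p - \not q - m}\,\gamma_\nu\, \frac{1}{\not p - \not k_1 - m} - \gamma_\rho\gamma_5\, \frac{1}{\not p - \not k_2 - m}\,\gamma_\nu\, \frac{1}{\not p - m} \right) \\
&= \frac{k_1^\sigma}{(2\pi)^4}\int d^4p\, \frac{\partial}{\partial p^\sigma} \left( \frac{{\rm Tr}\left[ \gamma_\rho\gamma_5\left( \not p - \not k_2 + m \right)\gamma_\nu\left( \not p + m \right) \right]}{\left[ (p - k_2)^2 - m^2 \right]\left( p^2 - m^2 \right)} \right) \\
&= -\frac{2i\pi^2 k_1^\sigma}{(2\pi)^4}\, \lim_{p\to\infty} {\rm Tr}\left( \gamma_\rho\gamma_5\gamma_\alpha\gamma_\nu\gamma_\beta \right) \frac{k_2^\alpha\, p^\beta p_\sigma}{p^2} \\
&= \frac{1}{8\pi^2}\,\epsilon_{\nu\rho\sigma\delta}\,k_1^\sigma k_2^\delta
\end{aligned} \tag{12.55}$$

Combining this with the shift in Eq. (12.51), this means that:

$$k_1^\mu\, T_{\mu\nu\rho}(\beta) = \frac{1+\beta}{8\pi^2}\,\epsilon_{\nu\rho\sigma\delta}\,k_1^\sigma k_2^\delta \tag{12.56}$$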


The key point is now this: For arbitrary values of $\beta$, it is impossible to keep
both vector current conservation and axial vector conservation. We will keep the 
vector current conserved and push the anomaly entirely onto the axial current 
conservation. 

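With the choice $\beta = -1$, we find:


$$q^\rho\, T_{\mu\nu\rho} = 2m\,T_{\mu\nu} - \frac{1}{2\pi^2}\,\epsilon_{\mu\nu\rho\sigma}k_1^\rho k_2^\sigma \tag{12.57}$$

Written in $x$ space, the ABJ anomaly can be summarized as follows:

$$\partial^\mu A_\mu = 2imP(x) + \frac{1}{8\pi^2}F^{\mu\nu}\tilde F_{\mu\nu} \tag{12.58}$$


where $\tilde F_{\mu\nu} = \frac{1}{2}\epsilon_{\mu\nu\alpha\beta}F^{\alpha\beta}$.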
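For the non-Abelian case, we must study the following anomalous V-V-A graph:


$$T^{abc}_{\mu\nu\rho}(k_1, k_2, q) = i\int d^4x_1\, d^4x_2\, \langle 0|T\left[ V^a_\mu(x_1)V^b_\nu(x_2)A^c_\rho(0) \right]|0\rangle\, e^{ik_1\cdot x_1 + ik_2\cdot x_2} \tag{12.59}$$

The anomalous Ward–Takahashi identity becomes:

$$q^\rho\, T^{abc}_{\mu\nu\rho} = 2m\,T^{abc}_{\mu\nu} - \frac{1}{2\pi^2}\,\epsilon_{\mu\nu\rho\sigma}k_1^\rho k_2^\sigma\, D^{abc} + \cdots \tag{12.60}$$


where:


$$D^{abc} = \frac{1}{2}\,{\rm Tr}\left( \{\tau^a, \tau^b\}\tau^c \right) \tag{12.61}$$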
(The anticommutator in the trace comes from the fact that we must add two
triangle diagrams together to produce the anomaly. The only difference between 
these two diagrams is that the a and b lines are interchanged; so this explains the 
anticommutator.) 
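
The group-theory factor $D^{abc}$ can be evaluated directly for a given representation. A minimal sketch, assuming the fundamental representation of SU(2) with $\tau^a = \sigma^a/2$ (an illustrative choice of group, not one singled out by the text), for which the symmetrized trace vanishes identically:

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def trace(a):
    return a[0][0] + a[1][1]


# SU(2) generators tau^a = sigma^a / 2 (an illustrative choice of group).
TAU = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]


def d_symbol(a, b, c):
    """D^{abc} = (1/2) Tr({tau^a, tau^b} tau^c), as in Eq. (12.61)."""
    anticommutator = add(matmul(TAU[a], TAU[b]), matmul(TAU[b], TAU[a]))
    return 0.5 * trace(matmul(anticommutator, TAU[c]))


values = [d_symbol(a, b, c) for a in range(3) for b in range(3) for c in range(3)]
print(max(abs(v) for v in values))
```

Because the anticommutator of two Pauli matrices is proportional to the identity and the generators are traceless, every component is zero; a group with a nonvanishing symmetric invariant would give a nonzero result from the same function.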

The origin of the axial anomaly, however, has much deeper significance at the 
quantum level, persisting for every possible regularization scheme. The method 
of Pauli—Villars regularization, for example, violates chiral symmetry, because we 
have explicitly added a massive fermion into the theory. Thus, a theory that is 
classically chiral invariant does not necessarily maintain chiral invariance if we 
use the Pauli—Villars regularization method. 

This anomaly persists even if we use other regularization schemes. For example, in the dimensional regularization scheme, there is no higher-dimensional counterpart of $\gamma_5$, so we expect that dimensional regularization will also spoil chiral symmetry.

In Chapter 15, we will discuss yet another regularization scheme, putting 
space-time on a discrete lattice. In contrast to the previous regularization schemes, 
lattice regularization does preserve chiral invariance at every step of the transition 
from the classical theory to the quantum theory. Putting fermions on a lattice does 
not spoil chiral symmetry at all. Then, the theory is chirally invariant even as we 
perform the quantization program. (However, there is still a catch to this, as we 
shall see.) 


12.6 QCD and Pion Decay into Gamma Rays 


One of the earliest discoveries in this area was the realization that these anomalies may actually solve the $\pi^0 \to 2\gamma$ puzzle. Historically, it was noticed that $\pi$ meson decay into two photons was not occurring with the expected rate. However, by correcting for the presence of an anomaly, we can obtain the experimentally observed decay rate. (The presence of the anomaly does not necessarily spoil renormalization, because here there is no Ward–Takahashi identity that is destroyed, since there is no local conserved axial current.)

The Feynman graph that mediates pion decay into two photons is a triangle 
graph, in which the two photons couple to an internal fermion loop via two currents 
$J_\mu$. The pion also couples to this internal fermion loop, but, because of PCAC,
the pion couples via the axial hadronic current. Thus, we have the classic V-V-A 
triangle, which we know is anomalous. 

The decay of a pion of momentum $p$ into two photons of momenta $k_1$ and $k_2$ is denoted by:


$$\pi^0(p) \to \gamma(k_1) + \gamma(k_2) \tag{12.62}$$

and governed by the following matrix element:


$$\langle \gamma(k_1, \epsilon_1), \gamma(k_2, \epsilon_2)|\pi^0(p)\rangle = i(2\pi)^4\,\delta^4(p - k_1 - k_2)\,\epsilon_1^\mu(k_1)\,\epsilon_2^\nu(k_2)\,\Gamma_{\mu\nu}(p, k_1, k_2) \tag{12.63}$$


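The tensor $\Gamma_{\mu\nu}$, in turn, is given by:

$$\Gamma_{\mu\nu}(p, k_1, k_2) = e^2\int d^4x\, e^{ik_1\cdot x}\, \langle 0|T\left[ J_\mu(x)J_\nu(0) \right]|\pi^0(p)\rangle \tag{12.64}$$


where $J_\mu$ is the electromagnetic current.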

By Lorentz invariance, we know that the only tensors that we can use to construct this matrix element are $k_1$, $k_2$, and $\epsilon_{\mu\nu\rho\sigma}$. Since the pion is a pseudoscalar particle, we must choose:


$$\Gamma_{\mu\nu}(p, k_1, k_2) = i\,\epsilon_{\mu\nu\rho\sigma}k_1^\rho k_2^\sigma\,\Gamma(p^2) \tag{12.65}$$


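Next, we can use LSZ methods to reduce out the pion field appearing in the state vector $|\pi^0(p)\rangle$. Then we use PCAC, given by $\partial^\mu A^a_\mu = f_\pi m_\pi^2\,\pi^a$, to replace the pion field with the divergence of the axial current. Then our tensor becomes:


$$\begin{aligned}
\Gamma_{\mu\nu}(p, k_1, k_2) = &\;\frac{ie^2\left( p^2 - m_\pi^2 \right)}{f_\pi m_\pi^2}\int d^4x\, d^4y\, e^{ik_1\cdot y - ip\cdot x} \\
&\times \langle 0|T\left[ \partial^\lambda A^3_\lambda(x)J_\mu(y)J_\nu(0) \right]|0\rangle
\end{aligned} \tag{12.66}$$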


(Because we are analyzing the $\pi^0$ field, we must use the third component of isospin $A^3_\mu$ in the PCAC relations.)

Our goal is to derive a low-energy theorem on this matrix element. To do this, 
let us define a new matrix element, which will prove useful in our discussion: 


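$$\Gamma_{\mu\nu\lambda}(p, k_1, k_2) = \int d^4x\, d^4y\, e^{ik_1\cdot y - ip\cdot x}\, \langle 0|T\left[ A^3_\lambda(x)J_\mu(y)J_\nu(0) \right]|0\rangle \tag{12.67}$$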


The trick is to find a relationship between our $\Gamma_{\mu\nu}$ and the new $\Gamma_{\mu\nu\lambda}$ that we have just written.

Next, we will hit this tensor with $p^\lambda$. Contracting the left-hand side with $p^\lambda$ is equivalent to taking the derivative with respect to $x$ of the right-hand side. However, taking the $x$ derivative of the right-hand side will pick up the derivative of a $\theta$ function appearing in the time-ordered product, which in turn will yield delta functions. Performing the derivative, we find:

$$\begin{aligned}
p^\lambda\,\Gamma_{\mu\nu\lambda}(p, k_1, k_2) = -i\int d^4x\, d^4y\, e^{ik_1\cdot y - ip\cdot x}
\Big( &\langle 0|T\left[ \partial^\lambda A^3_\lambda(x)J_\mu(y)J_\nu(0) \right]|0\rangle \\
&+ \langle 0|T\left\{ \delta(x_0 - y_0)\left[ A^3_0(x), J_\mu(y) \right] J_\nu(0) \right\}|0\rangle \\
&+ \langle 0|T\left\{ \delta(x_0)\left[ A^3_0(x), J_\nu(0) \right] J_\mu(y) \right\}|0\rangle \Big)
\end{aligned} \tag{12.68}$$


In the limit as $p \to 0$, the left-hand side of the equation vanishes. Also, the two commutators on the right-hand side of the equation vanish, using the current algebra relations. Thus, all terms have vanished except $\Gamma_{\mu\nu}$, which therefore must also vanish. This means that the entire equation has collapsed, showing that $\pi^0$ can never decay to two photons in this limit, which violates the experimental data. This problem can be resolved by noting that the Feynman graph that dominates this process to lowest order is the V-V-A triangle graph, which we know is anomalous.

Inserting the anomaly back into the previous relationship, we therefore have:

$$\lim_{p^2\to 0}\Gamma_{\mu\nu}(p, k_1, k_2) = \frac{ie^2 D}{2\pi^2 f_\pi}\,\epsilon_{\mu\nu\rho\sigma}k_1^\rho k_2^\sigma \tag{12.69}$$

where the value of $D$ depends on the fermions moving within the triangle graph.

Comparing this with our previous Lorentz decomposition of this tensor, Eq. (12.65), we therefore have:


$$\Gamma(0) = \frac{e^2 D}{2\pi^2 f_\pi} \tag{12.70}$$


Now let us calculate D. To lowest order, we can assume that the naive quark 
model is correct. Using free-field representations of the currents, we find that the 
electromagnetic current is related to the charge Q matrix by the following: 

$$J_\mu(x) = \bar q(x)\,\gamma_\mu\, Q\, q(x) \tag{12.71}$$

where:


$$Q = {\rm diag}\,(2/3, -1/3, -1/3) \tag{12.72}$$


and that the axial current is given by:


$$A^3_\mu = \bar q\,\gamma_\mu\gamma_5\,\frac{\lambda^3}{2}\,q \tag{12.73}$$

where:

$$\lambda^3 = {\rm diag}\,(1, -1, 0) \tag{12.74}$$

Inserting these expressions back into the value for $D$, we find:


$$D = \frac{N}{2}\,{\rm Tr}\left( \{Q, Q\}\frac{\lambda^3}{2} \right) = \frac{N}{6} \tag{12.75}$$

where $N$ is the number of quark colors. Assuming $N = 3$, this gives us a value of:


$$\Gamma(0) = 0.037\, m_\pi^{-1} \tag{12.76}$$

This is to be compared with the experimental value:

$$\Gamma(m_\pi^2) = 0.0375\, m_\pi^{-1} \tag{12.77}$$


so the agreement is good if we have three colors of quarks.
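
The group-theory factor in Eq. (12.75) can be checked with exact rational arithmetic. A minimal sketch, taking $Q$ and $\lambda^3$ as the diagonal matrices of Eqs. (12.72) and (12.74); the function name is an illustrative assumption:

```python
from fractions import Fraction

Q = [Fraction(2, 3), Fraction(-1, 3), Fraction(-1, 3)]   # Eq. (12.72)
LAMBDA3 = [Fraction(1), Fraction(-1), Fraction(0)]       # Eq. (12.74)


def anomaly_coefficient(n_colors):
    """D = (N/2) Tr({Q, Q} lambda^3 / 2) for diagonal Q and lambda^3."""
    # For diagonal matrices {Q, Q} = 2 Q^2, so the trace is a sum over flavors.
    trace = sum(2 * q * q * l / 2 for q, l in zip(Q, LAMBDA3))
    return Fraction(n_colors, 2) * trace


for n in (1, 3):
    print(n, anomaly_coefficient(n))
```

Only the up and down quarks contribute, since the strange quark has a zero entry in $\lambda^3$; the result scales linearly with the number of colors.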

Yet another check on the Standard Model is the fact that the anomaly con- 
tribution of the leptonic and hadronic sectors of the Weinberg—Salam model just 
cancel each other. The leptonic sector of the Weinberg—Salam model, by itself, 
is not renormalizable because of the chiral anomaly. However, the true anomaly 
is the sum of the anomalies coming from the leptonic and hadronic sectors of 
the Glashow—Weinberg—Salam model, and these cancel perfectly, giving us con- 
fidence once again of the correctness of the Standard Model. 

To see how this works, let us calculate the anomaly contribution from the leptonic sector of the Weinberg–Salam model. In particular, the calculation simplifies if we just calculate the anomaly coming from the coupling of the $Z^0$ with $W^+$ and $W^-$. (This WWZ triangle graph appears, for example, in neutrino–neutrino scattering, where a triangle graph is exchanged between the two neutrinos.)

Since right-handed fermions do not couple to the W vector meson, we are only 
interested in the left-hand anomaly: 


$${\rm Tr}\left( \{\tau^a, \tau^b\}\tau^c \right)_L \tag{12.78}$$


For the W mesons, the isospin coupling is easy to find, since they couple to fermions via $\tau^\pm$:


$$W^\pm:\quad \tau^\pm \sim \tau^1 \pm i\tau^2 \tag{12.79}$$


To find the contribution from the Z vertex is a bit more complicated, but it 
can be read off the Lagrangian using Eqs. (10.62) and (10.66). The Z gives an 
isospin coupling of:

$$Z:\quad \sim \sec\theta_W\left( \tau^3 + \sin^2\theta_W\, Q \right) \tag{12.80}$$

Now let us insert everything into the anomaly:

$$\text{Anomaly} \sim {\rm Tr}\left[ \left( \tau^3 + \sin^2\theta_W\, Q \right)\{\tau^+, \tau^-\} \right] \tag{12.81}$$

All terms in the trace vanish, except the one containing the charge $Q$, so we have:


$$\text{Anomaly} \sim \sum_i Q_i \tag{12.82}$$


In other words, the left-handed charges must sum to zero. However,
it is easy to see that the sum of the electron and neutrino charge does not vanish, 
and hence the Weinberg—Salam model, for the leptons, is not renormalizable. In 
other words, the leptonic sector by itself is not self-consistent. 

In the Standard Model, however, we add the contribution of both the leptonic 
and the hadronic sector. The right-handed quarks do not couple to the W meson, 
so we only have to sum the contributions of the charges of the left-handed quarks. 
The sum of the two sectors is given by: 


$$\text{Anomaly} \sim Q(e) + Q(\nu) + 3\left[ Q(u) + Q(d) \right] = -1 + 0 + 3\left( \frac{2}{3} - \frac{1}{3} \right) = 0 \tag{12.83}$$


Thus, for one generation of quarks and leptons, we have an exact cancellation. 
This result is also welcomed, because it helps to explain the rough symmetry 
in the number of leptons and quarks that have been discovered over the years. 
Every time a new lepton was discovered, a new quark would be discovered soon 
afterwards, and vice versa. From this point of view, we need an exact balancing 
between the lepton and quark sector to give us a renormalizable, anomaly-free 
theory. (However, this still does not explain why leptons and quarks come in three 
distinct generations.) 
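
The arithmetic behind Eq. (12.83) can be laid out explicitly. A minimal sketch with exact fractions, counting each quark color separately; the dictionary layout and names are illustrative assumptions:

```python
from fractions import Fraction

# (electric charge, color multiplicity) of the left-handed fermions in one generation
LEPTONS = {"nu_e": (Fraction(0), 1), "e": (Fraction(-1), 1)}
QUARKS = {"u": (Fraction(2, 3), 3), "d": (Fraction(-1, 3), 3)}


def charge_sum(sector):
    """Sum of charges over a sector, counting each color separately."""
    return sum(charge * colors for charge, colors in sector.values())


lepton_part = charge_sum(LEPTONS)
quark_part = charge_sum(QUARKS)
total = lepton_part + quark_part
print(lepton_part, quark_part, total)
```

Neither sector vanishes on its own; the cancellation requires both the fractional quark charges and the factor of three from color.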


12.7 Fujikawa’s Method 


There is another method of obtaining the anomaly that is much simpler and more 
conceptually intuitive using path integrals.¹⁰ We notice that the anomaly arises
because of a failure of the regularization scheme to accommodate the axial current 
conservation. Thus, we might expect the failure of the symmetry to take place at 
a more fundamental level, such as the quantum measure. 
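Under the chiral gauge transformation:

$$\begin{aligned}
\psi &\to e^{i\alpha(x)\gamma_5}\,\psi \\
\bar\psi &\to \bar\psi\, e^{i\alpha(x)\gamma_5}
\end{aligned} \tag{12.84}$$

we wish to calculate the change both in the action as well as in the functional measure.

The action transforms as:


$$\int d^4x\, \bar\psi\, i\!\not D\,\psi \to \int d^4x\, \bar\psi\, i\!\not D\,\psi - \int d^4x\, \alpha(x)\,\partial_\mu J^\mu_5 \tag{12.85}$$

where the axial current is given by:

$$J^\mu_5 = \bar\psi\gamma^\mu\gamma_5\psi \tag{12.86}$$

The measure transforms as:

$$D\psi\, D\bar\psi \to \left[ \det\left( e^{i\alpha\gamma_5} \right) \right]^{-2} D\psi\, D\bar\psi \tag{12.87}$$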
Normally, we discard the determinant because it appears to be a constant. 
However, closer analysis of this term shows that it is actually divergent, and hence 
requires regularization. This process of regularizing the determinant, in turn, will 
generate the anomaly. To determine the value of the determinant carefully, let us introduce a complete set of eigenfunctions $\phi_n$ of the operator $\not D$:


$$\not D\,\phi_n(x) = \lambda_n\phi_n(x) \tag{12.88}$$


We assume that the eigenvalues $\lambda_n$ are all discrete, although this is not necessary. We will normalize these eigenfunctions as follows:


$$\int d^4x\, \phi_m^\dagger(x)\,\phi_n(x) = \delta_{mn} \tag{12.89}$$


Then the Dirac spinor can be decomposed in terms of this complete set of eigen- 
functions: 


$$\psi(x) = \sum_n a_n\phi_n(x), \qquad \bar\psi(x) = \sum_n \phi_n^\dagger(x)\,\bar b_n \tag{12.90}$$


The functional measure can be rewritten as differentials over $da_n$ and $d\bar b_n$:


$$D\psi\, D\bar\psi = \prod_n da_n \prod_m d\bar b_m \tag{12.91}$$
We are now in a position to determine the determinant in the functional measure. The transformation of the field variables is now written as:


$$\psi'(x) = e^{i\alpha(x)\gamma_5}\,\psi(x) \quad\Longrightarrow\quad \sum_n a'_n\phi_n(x) = e^{i\alpha(x)\gamma_5}\sum_m a_m\phi_m(x) \tag{12.92}$$


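Let us multiply both sides by $\phi_n^\dagger$ and integrate over $x$. Then we find:


$$\begin{aligned}
a'_n &= \sum_m C_{nm}\,a_m \\
C_{nm} &= \int d^4x\, \phi_n^\dagger(x)\, e^{i\alpha(x)\gamma_5}\, \phi_m(x)
\end{aligned} \tag{12.93}$$
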


Thus, the change in functional measure is given by: 


$$\prod_n da'_n = \left[ \det\left( C_{nm} \right) \right]^{-1} \prod_n da_n \tag{12.94}$$


If the determinant of $e^{i\alpha\gamma_5}$ were equal to one, then the functional measure
would be invariant under a chiral transformation. However, a careful analysis 
shows that this determinant is not equal to one and, in fact, is potentially divergent. 
(The determinant occurs with exponent minus one because we are dealing with 
Grassmann variables, not ordinary c numbers.) 
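
The inverse power can be seen already for a single Grassmann variable. A minimal sketch, using the Berezin integration rules $\int da = 0$ and $\int da\, a = 1$ (standard conventions, assumed here rather than taken from this section):

```latex
\begin{align*}
a' &= c\,a \\
1 = \int da'\, a' &= \int da'\, c\, a \quad \Longrightarrow \quad da' = c^{-1}\, da \\
\prod_n da'_n &= \left(\det C\right)^{-1} \prod_n da_n
\qquad \text{for } a'_n = \sum_m C_{nm}\,a_m
\end{align*}
```

For ordinary commuting variables the same change of variables would give $\det C$ instead.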

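For small $\alpha(x)$, we can make some approximations and rewrite the determinant factor as:


$$\begin{aligned}
\left[ \det\left( C_{nm} \right) \right]^{-1}
&= \det\left( \delta_{nm} + i\int d^4x\, \alpha(x)\,\phi_n^\dagger(x)\gamma_5\phi_m(x) \right)^{-1} \\
&= \exp\left( -i\sum_n \int d^4x\, \alpha(x)\,\phi_n^\dagger(x)\gamma_5\phi_n(x) \right) \\
&= \exp\left( -i\int d^4x\, \alpha(x)A(x) \right)
\end{aligned} \tag{12.95}$$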


where we have defined: 


$$A(x) = \sum_n \phi_n^\dagger(x)\,\gamma_5\,\phi_n(x) \tag{12.96}$$


Since the $D\bar\psi$ yields the same determinant, we find that the overall measure transforms as:


$$D\psi\, D\bar\psi \to \exp\left( -2i\int d^4x\, \alpha(x)A(x) \right) D\psi\, D\bar\psi \tag{12.97}$$
Written in this fashion, the determinant in the functional measure is actually
divergent, and hence must be regularized. This process of regularization, in 
turn, will generate the anomaly, since the axial current conservation cannot be 
maintained by any regularization scheme. 

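To regularize this sum, we will find it convenient to introduce the convergence factor $\exp\left[-(\lambda_n/M)^2\right]$ and take the limit as $M \to \infty$. Inserting this convergence factor into the sum, we have:


$$\begin{aligned}
A(x) &= \lim_{M\to\infty} \sum_n \phi_n^\dagger(x)\,\gamma_5\, e^{-(\lambda_n/M)^2}\,\phi_n(x) \\
&= \lim_{M\to\infty} \sum_n \phi_n^\dagger(x)\,\gamma_5\, e^{-(\not D/M)^2}\,\phi_n(x)
\end{aligned} \tag{12.98}$$


where we have replaced $\lambda_n$ with $\not D$.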

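Since we are taking the trace with respect to $\phi_n$, we are free to change the basis of the trace. Using Eq. (8.23), we can change the basis to $|k\rangle$ eigenstates instead, as follows:


$$\phi_n(x) = \langle x|n\rangle = \int d^4k\, \langle x|k\rangle\langle k|n\rangle = \int\frac{d^4k}{(2\pi)^2}\, e^{-ik\cdot x}\,\langle k|n\rangle \tag{12.99}$$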


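Then the trace of an arbitrary matrix $\mathcal{M}$ can be expressed as:


$$\begin{aligned}
{\rm Tr}\,\mathcal{M}(x) &= \sum_n \phi_n^\dagger(x)\,\mathcal{M}(x)\,\phi_n(x) \\
&= \sum_n \langle n|x\rangle\,\mathcal{M}(x)\,\langle x|n\rangle \\
&= \sum_n \int d^4k'\, \langle n|k'\rangle\langle k'|x\rangle\,\mathcal{M}(x)\int d^4k\, \langle x|k\rangle\langle k|n\rangle \\
&= \int\frac{d^4k}{(2\pi)^4}\, e^{ik\cdot x}\,\mathcal{M}(x)\,e^{-ik\cdot x}
\end{aligned} \tag{12.100}$$

where we have removed the sum over $n$ because $1 = \sum_n |n\rangle\langle n|$.
The trace over $\gamma_5$ can now be written as:


$$A(x) = \lim_{M\to\infty} {\rm Tr}\int\frac{d^4k}{(2\pi)^4}\,\gamma_5\, e^{ik\cdot x}\, e^{-(\not D/M)^2}\, e^{-ik\cdot x} \tag{12.101}$$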


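Next, we must decompose $(\not D)^2$. Because $D_\mu$ is an operator, we must be careful in handling this expression. This factor can be decomposed into an odd piece proportional to $[\gamma^\mu, \gamma^\nu]$ and an even piece proportional to $\{\gamma^\mu, \gamma^\nu\}$. The odd piece, in turn, is proportional to $[D_\mu, D_\nu]$, which gives us $F_{\mu\nu}$. Putting everything together, we now have:


$$\begin{aligned}
A(x) &= \lim_{M\to\infty} {\rm Tr}\int\frac{d^4k}{(2\pi)^4}\,\gamma_5 \exp\left( -\frac{1}{M^2}\left\{ \left[ ik_\mu + D_\mu(x) \right]^2 + \frac{1}{4}\left[ \gamma^\mu, \gamma^\nu \right]F_{\mu\nu}(x) \right\} \right) \\
&= \lim_{M\to\infty} {\rm Tr}\,\gamma_5\left( \left[ \gamma^\mu, \gamma^\nu \right]F_{\mu\nu} \right)^2 \left( \frac{1}{4M^2} \right)^2 \frac{1}{2!}\int\frac{d^4k}{(2\pi)^4}\, e^{-k^2/M^2} \\
&= -\frac{1}{16\pi^2}\,F^{\mu\nu}\tilde F_{\mu\nu}
\end{aligned} \tag{12.102}$$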


In conclusion, we find that the trace of $\gamma_5$ can be written as:


$$A(x) = {\rm Tr}\,(\gamma_5) = -\frac{1}{16\pi^2}\,F^{\mu\nu}\tilde F_{\mu\nu} \tag{12.103}$$
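
The finite answer arises because the Gaussian momentum integral grows like $M^4$, exactly compensating the $1/M^4$ from expanding the exponential to second order in $F_{\mu\nu}$. A minimal numerical sketch of that integral, assuming it is evaluated in Euclidean space after Wick rotation (the rotation and the function name are assumptions of this sketch):

```python
import math


def euclidean_gaussian(M, steps=20000, cutoff=12.0):
    """Radial quadrature of  int d^4k / (2 pi)^4  exp(-k^2 / M^2)  in Euclidean space."""
    k_max = cutoff * M
    h = k_max / steps
    total = 0.0
    for i in range(1, steps):            # the integrand vanishes at k = 0
        k = i * h
        total += k**3 * math.exp(-(k / M) ** 2)
    total += 0.5 * k_max**3 * math.exp(-(k_max / M) ** 2)
    # Angular integration in four dimensions contributes 2 pi^2.
    return 2.0 * math.pi**2 * total * h / (2.0 * math.pi) ** 4


for M in (1.0, 2.0):
    print(M, euclidean_gaussian(M), M**4 / (16.0 * math.pi**2))
```

The two printed columns agree, reproducing the closed form $M^4/16\pi^2$ and hence the factor $1/16\pi^2$ in the anomaly.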


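Now let us put the total variation of the action and the measure together. From Eqs. (12.85), (12.95), and (12.103), we find that:


$$\begin{aligned}
D\psi\, D\bar\psi\, \exp\left( i\int d^4x\, \mathcal{L}(x) \right) \to\; &\exp\left( i\int d^4x \left[ \mathcal{L}(x) - \alpha(x)\,\partial_\mu J^\mu_5 \right] \right) \\
&\times \exp\left( -2i\int d^4x\, \alpha(x)\left[ -\frac{1}{16\pi^2}F^{\mu\nu}\tilde F_{\mu\nu} \right] \right) D\psi\, D\bar\psi
\end{aligned} \tag{12.104}$$

This functional is invariant if we choose:

$$\partial_\mu J^\mu_5 = \frac{1}{8\pi^2}\,F^{\mu\nu}\tilde F_{\mu\nu} \tag{12.105}$$
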


which is the same result that we found before in Eq. (12.58). 

In summary, we have seen that the path integral method allows us to gener- 
alize the Ward—Takahashi identities found earlier for QED. These identities arise 
because the generating functional $Z(J)$ is gauge invariant. When applied to gauge
theory, these identities become the Slavnov—Taylor identities and the BRST identi- 
ties. The BRST symmetry arises because there is a residual (global) symmetry left 
over after the gauge symmetry is broken and Faddeev—Popov ghosts are allowed 
into the action. 

These identities are crucial for renormalization. However, they can be violated 
by chiral anomalies, which must therefore be cancelled. In the Standard Model, 
the anomalies from leptons in the Weinberg—Salam model cancel against the 
anomalies coming from the quarks, giving us a renormalizable, anomaly-free
theory. 


12.8 Exercises 


1. For gauge theory, prove that the functional measure for the various fields is invariant under a BRST transformation.

2. Calculate explicitly the anomaly contribution of SO(3) and show that it vanishes.

3. Discuss the generalization of the chiral anomaly in higher dimensions, such as d = 6, 8, 10. What kinds of graphs are divergent? Using the Fujikawa method, calculate what the anomalous term to current conservation might look like. In 10 dimensions, show that the hexagon graph is anomalous.

4. Fill in the missing steps leading up to Eqs. (12.28), (12.29), and (12.31).

5. A representation $\lambda_a$ is called real if there exists a unitary matrix $U$ such that:

$$\lambda_a = -U\lambda_a^T U^\dagger \tag{12.106}$$

Show that the anomaly cancels for a real representation.

6. For the antisymmetric representation of SO(N) defined by $M_{ij}$, the anomaly is proportional to ${\rm Tr}\left( \{M_{ab}, M_{cd}\}M_{ef} \right)$. Show that an invariant tensor cannot be constructed out of Kronecker delta functions and antisymmetric $\epsilon$ tensors with the proper symmetry/antisymmetry properties of the anomaly (except for N = 6). Therefore, the anomaly vanishes for all SO(N) except for SO(6), where we have the invariant tensor $\epsilon_{abcdef}$.

7. Consider a Maxwell field locally coupled to a charged triplet meson field. Construct the Ward–Takahashi identity for this theory.

8. Calculate the Ward–Takahashi identity in a theory of spin 3/2 particles coupled to the Maxwell field, where the action contains $\epsilon^{\mu\nu\rho\sigma}\bar\psi_\nu\gamma_5 D_\mu\gamma_\rho\psi_\sigma$. (Note: this action is actually inconsistent.)

9. Prove that the Z meson contributes the isospin factor given in Eq. (12.80) to the anomaly.

10. Prove that the condition $Q_{\rm BRST}|\Psi\rangle = 0$ eliminates not only the ghost states within $|\Psi\rangle$, but also the longitudinal mode of the gauge field, leaving only the transverse, physical states. (Work only to the lowest order in the ghost expansion of $|\Psi\rangle$.) Show that this condition reduces back to Gauss's Law.

11. Fill in the missing steps in Eq. (12.50).

12. Fill in the missing steps in Eq. (12.55).


Chapter 13 


BPHZ Renormalization 
of Gauge Theories 


Veltman: I do not care what or how, but what we must have is at least one
renormalizable theory with massive charged vector bosons, and whether
that looks like Nature is of no concern, those are details that will be fixed
later by some model freak. . .

’t Hooft: I can do that.

Veltman: What do you say?

’t Hooft: I can do that.


13.1 Counterterms in Gauge Theory 


The renormalization of spontaneously broken gauge theories, proved by ’t Hooft, 
using powerful techniques developed by Veltman, Faddeev, Popov, Higgs, and 
others, opened the floodgates for acceptable quantum field theories of massive 
vector mesons, which were previously thought to be nonrenormalizable. 

In Chapter 7, we presented the proof of the renormalizability of QED based 
on the original Dyson—Ward multiplicative renormalization scheme. Although a 
number of proofs of the renormalization of non-Abelian gauge theories have been 
proposed, we present two such proofs that are quite general and can be applied to 
a wide variety of quantum field theories, including those that do not have gauge 
symmetries. We will present the proof based on the BPHZ method and, in the 
next chapter, a proof based on the renormalization group. 

The renormalization program for gauge theories proceeds much the same as
for $\phi^4$ and QED; that is,


1. First, by power counting arguments, we must isolate the superficially divergent
diagrams, show that their degree of divergence depends only on the number
of external lines, and that there are only a finite number of classes of these
divergent diagrams.

2. We must regularize the divergent diagrams in order to perform manipulations
on them.

3. We must show that we can absorb the divergences into the physical parameters
of the system, either by extracting out multiplicative renormalization
constants, or by subtracting off counterterms. Slavnov-Taylor or BRST identities
are needed to show that gauge invariance is maintained and that the
renormalized coupling constants have the correct value.

4. We then must prove, via an induction argument, that the theory is renormalizable
to all orders.


Of course, we must also check that the renormalization program does not 
spoil the original physical properties of the theory, such as unitarity. For gauge 
theories, for example, the proof that the renormalized theory is unitary is actually
nontrivial.


We begin this program by power counting to determine the superficial degree 
of divergence of the Feynman diagrams. We define: 


$L$ = number of loops
$E_\psi$ = number of external fermion legs
$E_A$ = number of external vector lines
$I_\psi$ = number of internal fermion lines
$I_A$ = number of internal vector lines
$I_G$ = number of internal ghost lines
$V_3$ = number of three-vector vertices
$V_4$ = number of four-vector vertices
$V_G$ = number of ghost-vector vertices
$V_\psi$ = number of fermion-vector vertices (13.1)


By now familiar arguments, we can show that the superficial degree of diver- 
gence of any Feynman diagram is equal to: 


$$D = 4L - 2I_A - I_\psi - 2I_G + V_3 + V_G \tag{13.2}$$


In addition, we have various identities among these numbers that eliminate all 
internal lines and vertices from D. As in QED, we now observe that two fermion 
lines connect with one vector meson line in a vertex. Thus, we have, as before: 


$$V_\psi = I_\psi + \frac{1}{2}E_\psi \tag{13.3}$$

Each ghost propagator connects onto one end of a ghost vertex, and there are no
external ghost lines, so that:

$$V_G = I_G \tag{13.4}$$

Counting the vector lines that end on each type of vertex, we also have the constraint:

$$E_A + 2I_A = 4V_4 + 3V_3 + V_G + V_\psi \tag{13.5}$$


Finally, we can count the number of loop variables in the theory. Each internal
line $I_A$, $I_\psi$, $I_G$ is associated with a momentum. However, there are restrictions
on these momenta. Each vertex $V_3$, $V_4$, $V_\psi$, $V_G$ contributes a delta function
constraint that enforces conservation of momentum at that point. We also have
the overall momentum conservation of the entire diagram. Thus:

$$L = I_A + I_\psi + I_G - V_3 - V_4 - V_\psi - V_G + 1 \tag{13.6}$$


Putting everything together, our final result is that the superficial degree of 
divergence is: 


$$D = 4 - E_A - \frac{3}{2}E_\psi \tag{13.7}$$


which is the same as for QED, as in Eq. (7.42). 
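The elimination of internal lines and vertices can be checked mechanically. The following is a minimal sketch in Python, not part of the original derivation: the function names are illustrative, and it assumes that $I_A$ denotes the number of internal vector lines. It samples vertex and external-line counts, solves Eqs. (13.3) to (13.6) for the internal lines and loops, and confirms that Eq. (13.2) collapses to Eq. (13.7).

```python
import random
from fractions import Fraction


def superficial_degree(E_A, E_psi, V3, V4, VG, Vpsi):
    """Evaluate D of Eq. (13.2) after solving Eqs. (13.3)-(13.6).

    Returns None when the counts cannot describe a diagram
    (a negative or half-integer number of internal lines).
    """
    two_I_psi = 2 * Vpsi - E_psi                       # Eq. (13.3)
    two_I_A = 4 * V4 + 3 * V3 + VG + Vpsi - E_A        # Eq. (13.5)
    if two_I_psi < 0 or two_I_psi % 2 or two_I_A < 0 or two_I_A % 2:
        return None
    I_psi, I_A = two_I_psi // 2, two_I_A // 2
    I_G = VG                                           # Eq. (13.4)
    L = I_A + I_psi + I_G - V3 - V4 - VG - Vpsi + 1    # Eq. (13.6)
    return 4 * L - 2 * I_A - I_psi - 2 * I_G + V3 + VG  # Eq. (13.2)


def check_power_counting(trials=2000, seed=0):
    """Compare Eq. (13.2) with Eq. (13.7) on random admissible counts."""
    rng = random.Random(seed)
    checked = 0
    for _ in range(trials):
        counts = [rng.randint(0, 6) for _ in range(6)]
        D = superficial_degree(*counts)
        if D is None:
            continue
        E_A, E_psi = counts[0], counts[1]
        assert D == 4 - E_A - Fraction(3, 2) * E_psi   # Eq. (13.7)
        checked += 1
    return checked
```

For instance, `superficial_degree(2, 0, 2, 0, 0, 0)` describes the one-loop vector self-energy built from two three-vector vertices, and `superficial_degree(0, 2, 0, 0, 0, 2)` the one-loop fermion self-energy.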

This means that gauge theory is, in principle, renormalizable. The degree 
of divergence is a function only of the number of external lines on any Feynman 
graph, and it decreases for higher point functions. Furthermore, it is easily checked 
that, as in QED, the classes of diagrams that diverge correspond to the renormalized 
quantities of the theory. Thus, by renormalizing these physical parameters, we 
can absorb all the divergences of the theory into these parameters. 

Next, we try to isolate the possibly divergent graphs in Figure 13.1. To be 
concrete, let us begin with the effective action defined in terms of the finite, 
physical parameters g and m (in Euclidean metric): 


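$$\mathcal{L} = \frac{1}{4}\left(F^a_{\mu\nu}\right)^2 + \frac{1}{2\alpha}\,\partial\cdot A^a\,\partial\cdot A^a + \partial_\mu\eta^{*a}D_\mu\eta^a + \bar\psi\left(\slashed{D} + im\right)\psi \tag{13.8}$$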


By power counting, we can easily categorize which classes of diagrams are 
divergent. To this effective action, we can then add the counterterms. It is 
therefore just a matter of counting to show that the counterterms we must add to 
the action have the form: 

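$$\begin{aligned}
\Delta\mathcal{L}_{\rm gauge} ={}& \frac{1}{4}(Z_3 - 1)\left(\partial_\mu A^a_\nu - \partial_\nu A^a_\mu\right)^2 - (Z_4 - 1)\,g\mu^{\epsilon/2} f^{abc} A^b_\mu A^c_\nu\,\partial_\mu A^a_\nu \\
&+ \frac{1}{4}(Z_5 - 1)\,g^2\mu^{\epsilon} f^{abc} f^{ade} A^b_\mu A^c_\nu A^d_\mu A^e_\nu + \frac{1}{2\alpha}(Z_\alpha - 1)\,\partial\cdot A^a\,\partial\cdot A^a
\end{aligned} \tag{13.9}$$

Figure 13.1. The set of diagrams in gauge theory which are potentially divergent.
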
[We will use dimensional regularization, so we will find it convenient to perform 
all integrations by working in the Euclidean metric and hence some of our signs 
will be reversed due to the metric; that is, iy = —4,] 
Also: 
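$$\Delta\mathcal{L}_{\rm fermion} = (Z_2 - 1)\,\bar\psi\,\slashed{\partial}\,\psi + im(Z_m - 1)\,\bar\psi\psi + i(Z_1 - 1)\,g\mu^{\epsilon/2} A^a_\mu\,\bar\psi\,\tau^a\gamma^\mu\psi \tag{13.10}$$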
and: 
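$$\begin{aligned}
\Delta\mathcal{L}_{\rm ghost} ={}& (Z_6 - 1)\,\partial_\mu\eta^{*a}\partial_\mu\eta^a - \frac{1}{2}(Z_7 - 1)\,g\mu^{\epsilon/2} f^{abc} A^a_\mu\,\eta^{*b}\overleftrightarrow{\partial_\mu}\eta^c \\
&- \frac{1}{2}(Z_8 - 1)\,g\mu^{\epsilon/2} f^{abc}\,\eta^{*a}\eta^b\,\partial_\mu A^c_\mu
\end{aligned} \tag{13.11}$$

where $\epsilon = 4 - d$. Each counterterm was chosen to kill off a divergence among 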
the Feynman diagrams generated by our action. 


If we add the two pieces $\mathcal{L}$ and $\Delta\mathcal{L}$ together, we arrive at the action defined
in terms of the bare, infinite quantities:


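$$\begin{aligned}
\mathcal{L} + \Delta\mathcal{L} ={}& \frac{1}{4}\left(\partial_\mu A^a_{0\nu} - \partial_\nu A^a_{0\mu}\right)^2 - g_0 f^{abc}A^b_{0\mu}A^c_{0\nu}\,\partial_\mu A^a_{0\nu} \\
&+ \frac{1}{4}g_0^2 f^{abc}f^{ade}A^b_{0\mu}A^c_{0\nu}A^d_{0\mu}A^e_{0\nu} + \frac{1}{2\alpha_0}\,\partial\cdot A^a_0\,\partial\cdot A^a_0 \\
&+ \partial_\mu\eta^{*a}_0\partial_\mu\eta^a_0 - \frac{1}{2}g_0 f^{abc}A^a_{0\mu}\,\eta^{*b}_0\overleftrightarrow{\partial_\mu}\eta^c_0 - \frac{1}{2}g_0 f^{abc}\eta^{*a}_0\eta^b_0\,\partial\cdot A^c_0 \\
&+ \bar\psi_0\slashed{\partial}\psi_0 + ig_0 A^a_{0\mu}\bar\psi_0\gamma^\mu\tau^a\psi_0 + im_0\bar\psi_0\psi_0
\end{aligned} \tag{13.12}$$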


Let us compare the equation on the left-hand side, which is defined in terms of 
the Z’s, with the right-hand side, which is defined in terms of the go’s. Setting the 
two sides equal, we find the relation between the multiplicative renormalization 
constants and the counterterms: 


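$$\begin{aligned}
g_0^{(1)} &= g\mu^{\epsilon/2}\frac{Z_1}{Z_2\sqrt{Z_3}}; &\qquad \psi_0 &= \sqrt{Z_2}\,\psi \\
g_0^{(2)} &= g\mu^{\epsilon/2}\frac{Z_4}{Z_3^{3/2}}; &\qquad A^a_{0\mu} &= \sqrt{Z_3}\,A^a_\mu \\
g_0^{(3)} &= g\mu^{\epsilon/2}\frac{\sqrt{Z_5}}{Z_3}; &\qquad \eta^a_0 &= \sqrt{Z_6}\,\eta^a \\
g_0^{(4)} &= g\mu^{\epsilon/2}\frac{Z_7}{\sqrt{Z_3}\,Z_6}; &\qquad m_0 &= m\frac{Z_m}{Z_2} \\
g_0^{(5)} &= g\mu^{\epsilon/2}\frac{Z_8}{\sqrt{Z_3}\,Z_6}
\end{aligned} \tag{13.13}$$

Here the superscript labels the vertex from which each bare coupling is read off:
the fermion-vector, three-vector, and four-vector vertices, and the two ghost-vector
vertices, respectively.

In principle, the various coupling constants do not have to be equal. In the
original bare action, these coupling constants were, of course, all identical, but after
renormalization there is no guarantee that these coupling constants will remain
equal. In other words, there is the possibility that renormalization will destroy
gauge invariance. If they are not equal, then gauge invariance is broken. Gauge
invariance, therefore, demands that the various coupling constants be identical.
This is where we need the Slavnov-Taylor identities, to guarantee that we can
maintain gauge invariance during renormalization.

The Slavnov-Taylor identities (the gauge generalization of the Ward-Takahashi
identities) preserve gauge invariance and hence keep all the coupling constants
equal:

$$g_0^{(1)} = g_0^{(2)} = g_0^{(3)} = g_0^{(4)} = g_0^{(5)} \tag{13.14}$$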


Setting the coupling constants to be equal, we arrive at: 


$$\frac{Z_1}{Z_2} = \frac{Z_4}{Z_3} = \frac{\sqrt{Z_5}}{\sqrt{Z_3}} = \frac{Z_7}{Z_6} = \frac{Z_8}{Z_6} \tag{13.15}$$

These identities are the gauge counterparts of the relation $Z_1 = Z_2$ found in
ordinary QED.


To prove that a theory is renormalizable, it is necessary (but not sufficient) to 
show that, by power counting, we can cancel all potential divergences by adding 
counterterms into the action, which in turn gives us a simple renormalization 
of the physical parameters. To complete the proof, we must show that we can 
write a recursion relation that proves that all diagrams are finite to all orders 
in perturbation theory. This recursion relation, in turn, must be able to handle 
overlapping divergences. 

To begin this inductive procedure, let us now show, to lowest order, that we 
can explicitly eliminate all divergences via this renormalization procedure. We 
will use the dimensional regularization approach, which is perhaps one of the 
most convenient regularization approaches for gauge theories since it respects 
the Ward—Takahashi identities. (The Pauli—Villars method, by contrast, violates 
gauge invariance for non-Abelian theories. To apply it to gauge theories, one must 
make a nontrivial generalization of this method involving higher derivatives.) 


13.2. Dimensional Regularization of Gauge Theory 


The task of demonstrating that all divergences at the first loop can be absorbed 
into a renormalization is simplified by repeating some of the calculations that we 
found in QED, except that we must include more diagrams with additional isospin 
indices. We will only analyze the fermion self-energy graph, the vertex correction, 
and the vector meson self-energy graph. The other divergences can be analyzed 
in a straightforward fashion. 

For example, the fermion self-energy diagram is identical to the QED electron 
self-energy diagram, except that we must add in the isospin indices: 


$$\Sigma(\slashed{p}) = \tau^a\tau^a\,\Sigma_{\rm QED}(\slashed{p}) \tag{13.16}$$


(We work in the Feynman gauge.) To calculate this, we must be more specific
about the structure of the Lie algebra. In general, for Lie algebra generators $\tau^a$
which are $d_R \times d_R$ matrices in some representation $R$ of the algebra, we have:

$$\mathrm{Tr}\,(\tau^a\tau^b) = C_R\,\delta^{ab} \tag{13.17}$$

where $C_R$ is called the Dynkin index of the representation $R$ of the algebra.

$\tau^a\tau^a$ is a Casimir operator of the Lie algebra; that is, it commutes with all
members of the Lie algebra. It can be chosen to be proportional to the $d_R \times d_R$ unit
matrix. To calculate the coefficient, we take the trace and contract over $a$:

$$\mathrm{Tr}\,(\tau^a\tau^a) = C_R\,\delta^{aa} = N C_R \tag{13.18}$$


where N is the number of generators in the algebra. 
Figure 13.2. The vertex correction for gauge theory has an additional graph not found in
QED because of the three-boson interaction.

Thus, we have:

$$\tau^a\tau^a = \frac{N}{d_R}\,C_R \tag{13.19}$$

summed over $a$. This then gives us:


$$\Sigma(\slashed{p}) = i\,\frac{N C_R}{d_R}\,\frac{g_0^2}{8\pi^2\epsilon}\left(\slashed{p} + 4m\right) \tag{13.20}$$


[Notice that the sign appearing in this equation differs from Eq. (7.94) because
of our choice of Euclidean metric. Also, for SU(M), we have $N = M^2 - 1$ and
$d_R = M$. For the fundamental representation, we have $C_R = \frac{1}{2}$.] Likewise,
the vertex correction graph resembles the vertex correction graph found in QED,
except that there is an extra graph coming from the three-boson graph (Fig. 13.2).

The first vertex correction graph is directly related to the QED result: 


$$\Gamma^{a(1)}_\mu = \tau^b\tau^a\tau^b\,\Gamma^{\rm QED}_\mu \tag{13.21}$$
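We use the fact that:

$$\begin{aligned}
\tau^b\tau^a\tau^b &= [\tau^b, \tau^a]\,\tau^b + \tau^a\tau^b\tau^b \\
&= if^{bac}\,\tau^c\tau^b + \frac{N}{d_R}C_R\,\tau^a \\
&= \frac{i}{2}f^{bac}\,[\tau^c, \tau^b] + \frac{N}{d_R}C_R\,\tau^a \\
&= -\frac{1}{2}C_{\rm ad}\,\tau^a + \frac{N}{d_R}C_R\,\tau^a
\end{aligned} \tag{13.22}$$

where $C_{\rm ad}$ is the Dynkin index in the adjoint representation of the group (the same
representation as the generators) and equals $M$ for SU(M).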
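These group-theory identities are easy to verify numerically for the smallest case. The sketch below is an illustration, not part of the original text: it assumes SU(2) in the fundamental representation, with $\tau^a = \sigma^a/2$, structure constants $f^{abc} = \epsilon^{abc}$, $N = 3$, $d_R = 2$, $C_R = 1/2$, and $C_{\rm ad} = 2$. It checks Eqs. (13.17), (13.19), and (13.22), together with the contraction $f^{abc}\tau^b\tau^c = \frac{i}{2}C_{\rm ad}\tau^a$ that is needed for the three-boson graph. All helper names are hypothetical.

```python
# SU(2) generators in the fundamental representation: tau^a = sigma^a / 2.
TAU = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]
IDENTITY = [[1, 0], [0, 1]]
N, D_R, C_R, C_AD = 3, 2, 0.5, 2.0  # assumed SU(2) values


def f(a, b, c):
    """Structure constants of SU(2): the Levi-Civita symbol on indices 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) / 2


def mul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def scale(c, A):
    return [[c * x for x in row] for row in A]


def total(mats):
    """Sum an iterable of 2x2 matrices."""
    out = [[0, 0], [0, 0]]
    for M in mats:
        out = [[out[i][j] + M[i][j] for j in range(2)] for i in range(2)]
    return out


def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))


def trace(A):
    return A[0][0] + A[1][1]


def su2_checks():
    """Return a dict mapping each identity to True if it holds for SU(2)."""
    idx = range(3)
    return {
        "Tr(tau^a tau^b) = C_R delta^ab": all(
            abs(trace(mul(TAU[a], TAU[b])) - C_R * (a == b)) < 1e-12
            for a in idx for b in idx),
        "tau^a tau^a = (N/d_R) C_R": close(
            total(mul(TAU[a], TAU[a]) for a in idx),
            scale(N * C_R / D_R, IDENTITY)),
        "tau^b tau^a tau^b = (-C_ad/2 + N C_R/d_R) tau^a": all(
            close(total(mul(mul(TAU[b], TAU[a]), TAU[b]) for b in idx),
                  scale(-C_AD / 2 + N * C_R / D_R, TAU[a]))
            for a in idx),
        "f^abc tau^b tau^c = (i/2) C_ad tau^a": all(
            close(total(scale(f(a, b, c), mul(TAU[b], TAU[c]))
                        for b in idx for c in idx),
                  scale(0.5j * C_AD, TAU[a]))
            for a in idx),
    }
```

Calling `su2_checks()` returns one boolean per identity; the same structure would carry over to SU(3) given explicit Gell-Mann matrices and structure constants, which are not included here.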
Our final result for the first vertex correction graph is then: 


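$$\Gamma^{a(1)}_\mu(p, q) = -ig_0\mu^{\epsilon/2}\,\tau^a\gamma_\mu\left(\frac{1}{2}C_{\rm ad} - \frac{N}{d_R}C_R\right)\frac{g_0^2}{8\pi^2\epsilon} \tag{13.23}$$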
Next, we must calculate the vertex correction piece coming from the three-boson
graph that does not appear in QED:


Tr? = —ghurlesoreabes f d*k y" 1 vy 
we 0 (2m)4 k—-m m 


yuk + P)o Bun + (q — 2p — kK)v8uo + (P+ q — 2k)yBov 
Oe ee ee) 
(BP — Gy ‘ 


We now contract over the isospin indices:

$$f^{abc}\tau^b\tau^c = \frac{i}{2}C_{\rm ad}\,\tau^a \tag{13.25}$$

and introduce Feynman parameters and integrate to zero any terms that are purely
linear in momenta:


7 d¢k 
3€/2 
Sie curt [ dx fay (Qr)4 


2h hy 
* [ke + m1 —x — y) + q?x + p?y — (qx — pyyYP 


(2) 
ry 


i 


~ighp tr rece ah ax [ és] — €/2)T(e/2) 


Art 2 e/2 
xX | eee een 
(=a —x—y)+q?x + p*y — (qx - =x) 


2 
ton yey, robalad , 
igou yt? Te + (13.26) 


HN 


The sum of the two contributions to the vertex correction gives us: 


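$$\Gamma^a_\mu = ig_0\mu^{\epsilon/2}\,\gamma_\mu\tau^a\,\frac{g_0^2}{8\pi^2\epsilon}\left(C_{\rm ad} + C_R\frac{N}{d_R}\right) \tag{13.27}$$

$$Z_1 = 1 - \frac{g_0^2}{8\pi^2\epsilon}\left(C_{\rm ad} + C_R\frac{N}{d_R}\right) \tag{13.28}$$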


Last, we would like to calculate the vacuum polarization graph for the gauge 
field. There are, unfortunately, four graphs that must be computed (Fig. 13.3), 
only one of which can be read off from our QED calculations. 
Figure 13.3. Of the four graphs contributing to the meson self-energy graph, only one has
a counterpart in QED.


That contribution to the vacuum polarization contains internal fermion lines. 
It is given by: 


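$$\Pi^{ab}_{\mu\nu}(p) = \mathrm{Tr}\,(\tau^a\tau^b)\,\frac{g_0^2}{16\pi^2\epsilon}\,\frac{8}{3}\left(\delta_{\mu\nu}p^2 - p_\mu p_\nu\right) \tag{13.29}$$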


By a straightforward application of Feynman’s rules, we can also calculate 
the contribution in which gauge mesons circulate in the interior loop. We merely 
contract over two gauge meson vertices: 


T1‘)4>(p) pe 1 2 ye paca pode tk Vw (13.30) 
a Pan (274 k?(p +k? 
where: 
eee [(2k + P)w8pa — (k +2P)o8up +(P — Mp8y0] 


x [(2k + p)vg’? — (2p +k) 5h + (p — k)? 5) | 


(4d — 6)k, ky + (2d — 3)(ky pv + kv py) + (d — 6) py py 


+ [(p—k? + (2p +k) leu (13.31) 


Let us now introduce Feynman parameters into the calculation: 
seat )= ge imi || ax f d*k 1 
pv \P)= (20)? [k2 + p2x(1 — x) 


x ((44 — 6)kyk, + (4d — 6)x(x — 1) +d — 6]p,p, 


+ {2k + p?[2x(x — 1) + 5}}8uv) 


2 1 
= SoH acd ¢bcd (3d — 3)8uvPO = d/2) 
~ Rant S f a ( [px — x)]'-4? 


(2 —d/2) 
[p?x(l =x) 4/2 


{ guvP*I5 — 2x(1 ~ x)] 


+ PyPrld —6 — (4d —6)x(1 — oi) 


a! 


= a 80 paced phed (F meen? mcs 5 Pa) ee (13.32) 


where we only keep the pole term and drop finite parts, and where we eliminate 
momentum integration over terms linear in the momentum. (We note that the 
finite parts to this integral contain infrared divergences.) 


Now we must also calculate the contribution to $\Pi_{\mu\nu}$ coming from the ghost
loop. We find:


dtk (k + pk, 
(21)4 2k + py 


Wee (p) a gout fos ee 


d4k (k — px),[k+p(1 —x)] 
2) ereacderbcd Mm 
= Boe fr f i ax | (2n)4 [k2 + px(1 — x)/? 


Sa Sou me dx Sul (1 — d/2) 
(4x4? [p2x(1 — x)]}!-4/2 


_ 5 PuPox(1 — x2 ~ 4/2) 
oie 


’ 1 
- ff ie Z8uvP + 3Pur) + (13.33) 


where we drop all finite parts. 
There are also the zero-momentum loop diagram and two tadpole graphs,
which do not contribute anything at all. We know that they will give us mass
corrections that do not have any momentum dependence, being proportional to
$\delta^4(0)$. However, we know that, by gauge invariance, the mass of the gluon is
zero even after renormalization. Therefore, we can drop these potential mass
corrections from our calculation.

We summarize our final results for some of the renormalization constants: 


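$$\begin{aligned}
Z_1 &= 1 - \frac{g_0^2}{8\pi^2\epsilon}\left(C_{\rm ad} + C_R\frac{N}{d_R}\right) + \cdots \\
Z_2 &= 1 - \frac{g_0^2}{8\pi^2\epsilon}\,C_R\frac{N}{d_R} + \cdots \\
Z_3 &= 1 + \frac{g_0^2}{8\pi^2\epsilon}\left(\frac{5}{3}C_{\rm ad} - \frac{4}{3}C_R\right) + \cdots
\end{aligned} \tag{13.34}$$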


which is consistent with Eq. (7.103). 
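As a quick cross-check of these constants, the sketch below tabulates the coefficients of $g_0^2/(8\pi^2\epsilon)$ using exact fractions. It is an illustration under stated assumptions: the one-loop values are taken to be $Z_1 - 1 \propto -(C_{\rm ad} + C_R N/d_R)$, $Z_2 - 1 \propto -C_R N/d_R$, and $Z_3 - 1 \propto \frac{5}{3}C_{\rm ad} - \frac{4}{3}C_R$, and the bare coupling is assumed to be related to the renormalized one by $g_0 = g\mu^{\epsilon/2}Z_1/(Z_2\sqrt{Z_3})$. The function names are hypothetical. Setting $C_{\rm ad} = 0$ and $C_R N/d_R = 1$ should reproduce the Abelian pattern $Z_1 = Z_2$.

```python
from fractions import Fraction as F


def one_loop_poles(C_ad, C_R, N, d_R):
    """Coefficients of g0^2 / (8 pi^2 eps) in Z1 - 1, Z2 - 1, Z3 - 1 (Feynman gauge)."""
    casimir = F(N) * C_R / d_R          # tau^a tau^a = (N / d_R) C_R
    z1 = -(C_ad + casimir)
    z2 = -casimir
    z3 = F(5, 3) * C_ad - F(4, 3) * C_R
    return z1, z2, z3


def coupling_pole(C_ad, C_R, N, d_R):
    """First-order pole coefficient of Z1 / (Z2 sqrt(Z3)) - 1.

    Expanding to first order: Z1 / (Z2 sqrt(Z3)) = 1 + z1 - z2 - z3 / 2 + ...
    """
    z1, z2, z3 = one_loop_poles(C_ad, C_R, N, d_R)
    return z1 - z2 - z3 / 2


# Abelian-like limit: no three-boson graph, unit charge factor.
abelian = one_loop_poles(0, 1, 1, 1)

# SU(3) with one fermion multiplet in the fundamental representation.
su3 = coupling_pole(3, F(1, 2), 8, 3)
```

In the Abelian-like limit the first two entries of `abelian` coincide, as expected from $Z_1 = Z_2$. For a non-Abelian group the combination returned by `coupling_pole` reduces algebraically to $-\frac{11}{6}C_{\rm ad} + \frac{2}{3}C_R$, in which the Casimir factor $C_R N/d_R$ has cancelled between $Z_1$ and $Z_2$.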

This now completes the first step in the induction process. Now, we must 
tackle the most difficult part of the program, which is to write down the recursion 
relations and show they are actually satisfied. 


13.3. BPHZ Renormalization 


The multiplicative renormalization procedures that we developed for QED are 
quite awkward when applied to gauge theories, since we have many more inter- 
action vertices and fields. We now present a different renormalization scheme, 
the BPHZ renormalization program,$^{1-4}$ which is one of the most powerful and 
versatile of the various renormalization programs. Although it has a reputation of 
being a formidable, difficult formalism, the essential features of this approach are 
easy to summarize. 

There are several important reasons for analyzing the BPHZ renormalization 
prescription: 


1. The BPHZ approach easily handles overlapping divergences, which are diffi- 
cult to manipulate in other formalisms. In fact, overlapping divergences are 
the chief complication in any renormalization program. 


2. It is independent of the regularization prescription, and hence may be used to 
show that renormalization theory is independent of the regularization scheme. 
Since we use a subtraction on the integrand of the Feynman integral, we never 
need to make any explicit mention of a regularization scheme. There is no 
need to discuss the details of Feynman graph divergences. All we need to 
know is that a prescription exists to render a graph convergent. 


3. Although the Dyson renormalization program outlined earlier is ideally suited 
for multiplicative renormalization, the BPHZ formalism is more closely re- 
lated to the counterterm method. 


In the BPHZ formalism, we assume that the usual power-counting analysis has 
been performed, leaving us with the final induction step. We begin by first showing 
that we can, via a Taylor expansion at zero external momentum, eliminate the 
divergent quantities of any graph by a subtraction. This is called the Bogoliubov 
R operation. In our discussion of the BPHZ technique, we will derive an explicit 
expression for the subtractions. We will then show that this method of subtractions 
can be rewritten in terms of counterterms added to the action. 

In this section, we will first try to outline the intuitive ideas behind the BPHZ 
program, in order to stress the simplicity of its basic ideas, and then later we will 
be more precise in our definitions. (We omit detailed proofs.) 

We begin by defining the superficial degree of divergence of a graph as the
degree of divergence given by power counting. We define a renormalization part
as a proper (1PI) diagram that is superficially divergent. Let $\Gamma$ be a particular
Feynman graph to which we associate a Feynman integral:


Fr 


lim, dk, --+dky Ip 


Ir 


[ [4rGe — 0) ] [ ve (13.35) 
a,b c 


where the integrand $I_\Gamma$ consists of a certain number of propagators and vertices.
This graph, in general, is divergent as $\epsilon \to 0^+$. (Our results, however, will be
independent of any particular regularization scheme.)

We will now define the finite part of this graph, denoted by $J_\Gamma$:

$$J_\Gamma = \lim_{\epsilon\to 0^+}\int dk_1\cdots dk_L\,R_\Gamma \tag{13.36}$$

The goal of the BPHZ renormalization scheme is to find a prescription or a
set of rules by which we can extract $R_\Gamma$ from any $I_\Gamma$. We define a graph to be
primitively divergent if it (1) is 1PI (one-particle irreducible), (2) is superficially
divergent, and (3) becomes convergent if any line is broken up. For these primitively
divergent graphs, let us introduce an operator $t^\Gamma$ that has the ability to extract out
the divergent part of a graph via a Taylor expansion at zero momentum. Then:

$$J_\Gamma = \int dk_1\cdots dk_L\,(1 - t^\Gamma)\,I_\Gamma \tag{13.37}$$
To define this operator, we define a Taylor expansion about the point where
all external momenta are set equal to zero. We define:

$$t^\Gamma f(p_1, \ldots, p_{E-1}) = f(0, \ldots, 0) + \cdots + \frac{1}{D!}\sum_{i_1, \ldots, i_D = 1}^{E-1}(p_{i_1})_{\lambda_1}\cdots(p_{i_D})_{\lambda_D}\left.\frac{\partial^D f}{\partial(p_{i_1})_{\lambda_1}\cdots\partial(p_{i_D})_{\lambda_D}}\right|_{p=0} \tag{13.38}$$

where $E - 1$ is the number of independent external momenta (for $E$ external lines)
and $D$ is the superficial degree of
divergence. The operator $(1 - t^\Gamma)$ has a simple interpretation: It just subtracts the
divergent part of an integral at zero momentum, with the number of subtractions
determined by the superficial degree of divergence $D$.
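A one-dimensional toy model makes the action of $(1 - t^\Gamma)$ concrete. The sketch below is not a Feynman integral from the text; it is an assumed stand-in with the integrand $1/(k + p + m)$, which is logarithmically divergent at large $k$ and therefore plays the role of a graph with $D = 0$. Subtracting the integrand at $p = 0$ leaves $-p/[(k + p + m)(k + m)]$, which falls off like $1/k^2$, so the cutoff can be removed. The names `unsubtracted`, `subtracted`, and `finite_limit` are illustrative.

```python
import math


def unsubtracted(p, cutoff, m=1.0):
    """Integral of 1 / (k + p + m) for k from 0 to cutoff; grows like log(cutoff)."""
    return math.log((cutoff + p + m) / (p + m))


def subtracted(p, cutoff, m=1.0):
    """Same integral after the D = 0 subtraction of the integrand at p = 0."""
    return unsubtracted(p, cutoff, m) - unsubtracted(0.0, cutoff, m)


def finite_limit(p, m=1.0):
    """Cutoff-independent value approached by `subtracted` as cutoff grows."""
    return math.log(m / (p + m))


# The subtracted integral stabilizes while the unsubtracted one keeps growing.
for cutoff in (1e2, 1e4, 1e8):
    print(cutoff, unsubtracted(2.0, cutoff), subtracted(2.0, cutoff))
```

For $D > 0$ one would subtract further terms of the Taylor series in $p$, one order for each power of the divergence.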

The more general case, however, is much more complicated than this because
a graph $\Gamma$ may have divergent subgraphs $\gamma_i$. In fact, a graph $\Gamma$ may be superficially
convergent but may contain divergent subgraphs. The bulk of our work is to find
a way in which to catalog and then subtract each of these divergent subgraphs.
Because of the large number of definitions we must make, we will first intuitively
sketch the outline of the BPHZ program, without regard to rigor, in order to display
the essence of the technique. In the next section, we will be more precise in our
definitions.

Let us define $\bar{R}_\Gamma$ as the integrand of a graph with all subgraph divergences
subtracted out. The only divergence left is therefore the overall divergence of the
entire graph. Once we subtract out this overall divergence, then we are left with
all divergences subtracted, so we have the renormalized integrand $R_\Gamma$:

$$R_\Gamma = \bar{R}_\Gamma - t^\Gamma\bar{R}_\Gamma \tag{13.39}$$


There are two equivalent approaches to finding the solution for $R_\Gamma$. Historically,
the first approach was pioneered by Bogoliubov and Parasiuk$^{1}$ and Hepp,$^{2}$
who wrote down a recursion relation for $R_\Gamma$ in terms of lower-order graphs. In the
second approach, Zimmerman$^{3}$ wrote down the explicit solution of these recursion
relations for $R_\Gamma$.

To understand both approaches, we first recall that divergent subgraphs $\gamma_i$ can
be one of three possible types. If we draw boxes around each subgraph, then these
boxes are either

1. Disjoint (the boxes are separated, with no common region).
2. Nested (one box appears entirely within another).
3. Overlapping (the boxes share some common lines).
Figure 13.4. The 6 ways we can draw boxes in the BPHZ approach, as shown here for the
two-loop graph, avoid the overlapping divergence problem that is a major obstacle in other
renormalization methods.

One can, of course, construct $R_\Gamma$ by simply subtracting off all possible subdivergences
within $I_\Gamma$. In Zimmerman’s approach, however, one omits the overlapping
divergences among the subdivergences. The subtractions are taken only over
nested and disjoint graphs. To see this, let $U$ be any particular set of boxes. Let
$\mathcal{F}$ be the total set of all possible combinations of boxes. For example, in Figures
13.4 and 13.5, we show how to draw boxes around the various subgraphs for a
two-loop and three-loop diagram, such that we ignore all overlapping subgraphs.

Figure 13.5. The 16 ways that boxes can be drawn for the three-loop case avoid overlapping
divergences.

There are 6 ways in which to draw these boxes for the two-loop diagram
(dropping overlapping combinations):


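$$\mathcal{F} = \{\emptyset,\ \gamma_1,\ \gamma_2,\ \gamma_3,\ \gamma_3\gamma_1,\ \gamma_3\gamma_2\} \tag{13.40}$$

It is essential to notice that we have omitted the overlapping cases: $\{\gamma_1\gamma_2\}$ and
$\{\gamma_3\gamma_2\gamma_1\}$. Symbolically, we may therefore write $R_\Gamma$ as the usual Feynman integral
minus the divergences associated with each of these subgraphs:

$$R_\Gamma = \left[1 - t^{\gamma_1} - t^{\gamma_2} - t^{\gamma_3} + (-t^{\gamma_3})(-t^{\gamma_1}) + (-t^{\gamma_3})(-t^{\gamma_2})\right]I_\Gamma \tag{13.41}$$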


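The generalization to the three-loop case is straightforward. The decomposition
into boxes is given by:

$$\begin{aligned}
\mathcal{F} = \{&\emptyset,\ \gamma_1,\ \gamma_2,\ \gamma_3,\ \gamma_4,\ \gamma_5, \\
&\gamma_2\gamma_1,\ \gamma_3\gamma_1,\ \gamma_4\gamma_2,\ \gamma_5\gamma_2,\ \gamma_5\gamma_1,\ \gamma_5\gamma_3, \\
&\gamma_5\gamma_4,\ \gamma_5\gamma_1\gamma_2,\ \gamma_5\gamma_3\gamma_1,\ \gamma_5\gamma_4\gamma_2\}
\end{aligned} \tag{13.42}$$

Then $R_\Gamma$ is given by:

$$\begin{aligned}
R_\Gamma = \bigl[&1 - t^{\gamma_1} - t^{\gamma_2} - t^{\gamma_3} - t^{\gamma_4} - t^{\gamma_5} + (-t^{\gamma_2})(-t^{\gamma_1}) \\
&+ (-t^{\gamma_3})(-t^{\gamma_1}) + (-t^{\gamma_4})(-t^{\gamma_2}) + (-t^{\gamma_5})(-t^{\gamma_2}) \\
&+ (-t^{\gamma_5})(-t^{\gamma_1}) + (-t^{\gamma_5})(-t^{\gamma_3}) + (-t^{\gamma_5})(-t^{\gamma_4}) \\
&+ (-t^{\gamma_5})(-t^{\gamma_1})(-t^{\gamma_2}) + (-t^{\gamma_5})(-t^{\gamma_3})(-t^{\gamma_1}) + (-t^{\gamma_5})(-t^{\gamma_4})(-t^{\gamma_2})\bigr]I_\Gamma
\end{aligned} \tag{13.43}$$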
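The counting of allowed box combinations can be reproduced by brute force. In the sketch below each divergent subgraph is represented by the set of internal lines it contains, and a combination is kept only if every pair of its members is disjoint or nested. The line labels are hypothetical: they are chosen only to reproduce the nesting and overlap pattern described in the text (they are not read off Figures 13.4 and 13.5), and the function names are illustrative.

```python
from itertools import combinations


def nonoverlapping(a, b):
    """True if two subgraphs (sets of lines) are disjoint or nested."""
    return a.isdisjoint(b) or a <= b or b <= a


def forests(subgraphs):
    """All combinations of subgraphs whose members are pairwise nonoverlapping.

    The empty combination is included, so the result has one entry per
    subtraction term in the symbolic formula for the renormalized integrand.
    """
    names = sorted(subgraphs)
    allowed = []
    for size in range(len(names) + 1):
        for combo in combinations(names, size):
            pairs = combinations(combo, 2)
            if all(nonoverlapping(subgraphs[x], subgraphs[y]) for x, y in pairs):
                allowed.append(combo)
    return allowed


# Two-loop pattern: g1 and g2 overlap (shared line 3); g3 is the whole graph.
TWO_LOOP = {
    "g1": frozenset({1, 2, 3}),
    "g2": frozenset({3, 4, 5}),
    "g3": frozenset({1, 2, 3, 4, 5}),
}

# Three-loop pattern: g1 and g2 are disjoint, g3 contains g1, g4 contains g2,
# g3 and g4 overlap each other and the "opposite" small box, g5 is the whole graph.
THREE_LOOP = {
    "g1": frozenset({1, 2}),
    "g2": frozenset({5, 6}),
    "g3": frozenset({1, 2, 3, 5}),
    "g4": frozenset({2, 4, 5, 6}),
    "g5": frozenset({1, 2, 3, 4, 5, 6, 7}),
}

print(len(forests(TWO_LOOP)), len(forests(THREE_LOOP)))
```

With these assumed line sets the enumeration gives 6 combinations for the two-loop pattern and 16 for the three-loop pattern, matching the counts quoted for Figures 13.4 and 13.5.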


What is remarkable is that this subtraction process works even if we simply
drop the troublesome overlapping divergences. These terms, we recall, invalidated
the naive multiplicative renormalization scheme of Dyson/Ward for QED, which
broke down at the 14th-order level. So it is rather surprising that we can simply
drop them in the BPHZ counterterm approach.

To see why the overlapping divergences can be dropped in this approach,
consider the two-loop case shown in Figure 13.4. A direct calculation of the
double-loop graph shows that it contains divergences proportional to $1/\epsilon^2$, which
can be cancelled, as well as $\log p^2/\epsilon$. This second type of divergence is the
celebrated overlapping divergence and cannot, at first glance, be cancelled by
adding any counterterm to the action. A term like $\log p^2$ is required to cancel
this diagram, and such a term does not appear in the action and hence cannot
be absorbed into any renormalization of a physical parameter. The method of
counterterms seems, at first glance, to fail.

Miraculously, however, such terms can, indeed, be cancelled if we take a
closer look at the method of counterterms. A single-loop counterterm for $\phi^4$ can,
of course, cancel the subdivergence in $\gamma_1$. When applied to the double-loop graph,
this counterterm gives us $1/\epsilon$ multiplied by the loop integration of the rest of the
graph, which produces $\log p^2$. The product of the two gives us $\log p^2/\epsilon$, which
is the term needed to cancel the overlapping divergence. Thus, the single-loop
counterterms, when applied to the double-loop graph, give us a product that can
cancel the overlapping divergence $\log p^2/\epsilon$.

In BPHZ language, this cancellation is written as: $(1 - t^{\gamma_3})\,t^{\gamma_1}t^{\gamma_2} = 0$.
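It may help to see how this identity connects the two ways of organizing the subtractions. The following worked step is a sketch for the two-loop case only, assuming $\gamma_3$ is the full graph and $\gamma_1$, $\gamma_2$ are its two overlapping subgraphs. The forest sum factorizes, and it differs from the naive product of all three subtractions by exactly the overlapping term, which vanishes by the identity just quoted:

```latex
\begin{align*}
R_\Gamma &= \bigl[1 - t^{\gamma_1} - t^{\gamma_2} - t^{\gamma_3}
            + t^{\gamma_3}t^{\gamma_1} + t^{\gamma_3}t^{\gamma_2}\bigr] I_\Gamma
          = (1 - t^{\gamma_3})\,(1 - t^{\gamma_1} - t^{\gamma_2})\, I_\Gamma , \\
(1 - t^{\gamma_3})(1 - t^{\gamma_1})(1 - t^{\gamma_2})\, I_\Gamma
         &= R_\Gamma + (1 - t^{\gamma_3})\, t^{\gamma_1} t^{\gamma_2}\, I_\Gamma
          = R_\Gamma .
\end{align*}
```

In this form, dropping the overlapping combinations from the list of boxes costs nothing, because the term they would have contributed is itself zero.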

Thus, overlapping divergences, which are difficult to handle in the multiplica- 
tive renormalization scheme, can be cancelled by carefully iterating lower-order 
counterterms for higher-order graphs. This demonstrates the superiority of the 
counterterm method over the multiplicative renormalization. 

In the same manner, one can show that all overlapping divergences drop out
to all orders, although we will not present the proof. BPHZ showed that this
cancellation can be generalized for an arbitrary number of $t^{\gamma_i}$ even if the $\gamma_i$ are
overlapping.

Although this result is gratifying, there is still one last step that we must 
complete. Zimmerman’s solution, although explicit, still has one serious disad- 
vantage. It contains nested graphs, which cannot be cancelled by the counterterm 
method. This is because counterterms in the action only cancel against disjoint 
graphs, never against nested graphs. (A simple application of Wick’s theorem and 
Feynman’s rules for the counterterms shows that nested subdivergences are never 
generated.) 

To make contact with the counterterm method, we will now use an equivalent 
method pioneered by BPH, which is equivalent to the Zimmerman solution. It is 
possible to absorb all unwanted nested graphs into purely disjoint graphs (which 
can be cancelled against counterterms) if we write down recursion relations for 
lower-order subgraphs. 

For example, in Figure 13.5, we notice that a nested graph arises from $\gamma_3$ and
$\gamma_1$. This nested divergence can be absorbed by introducing a new subtraction
operator $R_{\gamma_3}$, which operates on subdivergences: $R_{\gamma_3}I_\Gamma = I_\Gamma + (-t^{\gamma_1})I_\Gamma$, where $R_{\gamma_3}$
is an operator that subtracts out the divergences contained within the subgraph $\gamma_3$,
which is due to the subgraph $\gamma_1$. Therefore, the nested graph can be absorbed by
introducing this subtraction operator for subgraphs:

$$(-t^{\gamma_3})R_{\gamma_3}I_\Gamma = (-t^{\gamma_3})I_\Gamma + (-t^{\gamma_3})(-t^{\gamma_1})I_\Gamma \tag{13.44}$$

The last term is the nested graph, which has now been absorbed into the
operator $R_{\gamma_3}$. This method is quite general: All nested graphs can be absorbed
into the subtraction operator of some subgraph. BPHZ proved that this process
allows one to express $R_\Gamma$ by iterating $R_\gamma$ for lower-order disjoint subgraphs and
dropping all nested ones.

Let us now summarize both the Zimmerman and the BPH formalism. Let $\gamma$ be a
divergent subgraph. Let $\mathcal{D}$ be the set of all possible combinations of just disjoint
subgraphs and $\mathcal{F}$ be the set of both disjoint and nested graphs. Then the
formulas of Zimmerman and BPH, respectively, can be written symbolically as:


RrIp = P| if <n") Ir 


(13.45) 
For example, for Figure 13.5, the set $\mathcal{D}$ is given by just the disjoint set
$\{\emptyset, \gamma_1, \gamma_2, \gamma_3, \gamma_4, \gamma_1\gamma_2\}$. By expanding out all the terms in the BPH recursion
relation on the second line, we recover Zimmerman’s formula on the first line.

The advantage of the BPH recursion relation is that we sum solely over 
divergent disjoint graphs, which in turn can be cancelled against the counterterms 
appearing in the action. The recursion relation is then the last step in demonstrating 
that the BPHZ method guarantees that counterterms in the action can cancel against 
all potential divergences of field theory. 


13.4 Forests and Skeletons 


So far, our discussion has tried to emphasize the intuitive nature of this BPHZ 
approach, which is a specific prescription by which to subtract out all possible 
divergent subgraphs. This intuitive discussion, however, will now be repeated 
and strengthened by making a few rigorous definitions. Specifically, these def- 
initions will allow us to show the equivalence of BPH’s recursion formula and 
Zimmerman’s explicit solution. 

Let $\gamma$ be a subgraph within a graph $\Gamma$. Two graphs are mutually disjoint if
they have no lines or vertices in common:

$$\gamma_1 \cap \gamma_2 = \emptyset \tag{13.46}$$

Now define $\{\gamma_1, \gamma_2, \ldots, \gamma_n\}$ to be a set of mutually disjoint connected subdiagrams
of the same graph $\Gamma$. Then we define the reduction operation:

$$\Gamma/\{\gamma_1, \ldots, \gamma_n\} \tag{13.47}$$

which contracts each subgraph $\gamma_i$ down to a point.

We say that two subgraphs overlap if they share some lines and vertices. More
precisely, they overlap if none of the following holds:

$$\gamma_1 \cap \gamma_2 = \emptyset, \qquad \gamma_1 \subset \gamma_2, \qquad \gamma_2 \subset \gamma_1 \tag{13.48}$$

Both overlapping and nested graphs are omitted in Eq. (13.47).

Now we come to the definition of a forest (which includes nested graphs). A
forest $U$ of $\Gamma$ is a hierarchy of subdiagrams such that:

1. The elements of $U$ are all renormalization parts.
2. Any two elements of $U$ are nonoverlapping.

(Loosely speaking, as we saw before, a forest $U$ is a set of subgraphs that can
be either nested or disjoint, but not overlapping. Each subgraph is superficially
divergent. For example, there are 16 forests in Figure 13.5.)

A forest is called full if it contains $\Gamma$ itself, and it is called normal if it does
not. A forest is called empty if it contains only the null set.


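To define this subtraction scheme, we introduce the Bogoliubov $R$ operation.
BPH proved that $\bar{R}_\Gamma$ can be expressed recursively as:

$$\bar{R}_\Gamma = I_\Gamma + \sum_{\{\gamma_1, \ldots, \gamma_n\}} I_{\Gamma/\{\gamma_1, \ldots, \gamma_n\}}\prod_{i=1}^{n}O_{\gamma_i} \tag{13.49}$$

where we define:

$$O_\gamma = -t^\gamma\bar{R}_\gamma \tag{13.50}$$

Then $R_\Gamma$ can now be defined as follows:

$$R_\Gamma = \begin{cases}\bar{R}_\Gamma & \text{if } \Gamma \text{ is not a renormalization part} \\ (1 - t^\Gamma)\bar{R}_\Gamma & \text{if } \Gamma \text{ is a renormalization part}\end{cases} \tag{13.51}$$

Notice that this definition of the $R$ operation is recursive and that we only
subtract disjoint graphs. $\bar{R}_\Gamma$ is always defined in terms of $\bar{R}_\gamma$ of lower order.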
The remarkable thing about this procedure is that it is equivalent to Zimmerman’s
forest formula:

$$R_\Gamma = \sum_{\text{all } U}\ \prod_{\lambda \in U}(-t^\lambda)\,I_\Gamma \tag{13.52}$$

where we now subtract both disjoint and nested graphs, and where the factors in the
product are ordered such that $t^\lambda$ is to the left of $t^\sigma$ if $\lambda \supset \sigma$.

We now sketch the proof that Zimmermann's Eq. (13.52) satisfies the recursive 
BPH definition in Eq. (13.49). We can always find the unique set of biggest 
disjoint subgraphs $M_1, M_2, \ldots, M_n$ of any forest $U$. Each biggest subgraph $M_i$ 
contained within a forest $U$ may have smaller nested subgraphs contained within 
it. To construct this unique set, we take any two nonoverlapping subgraphs $\gamma_i$ and 
$\gamma_j$ within the forest $U$. Then we must have one of the three possibilities: 

$$\gamma_i \subset \gamma_j, \qquad \gamma_i \supset \gamma_j, \qquad \gamma_i \cap \gamma_j = \emptyset \tag{13.53}$$

For the first possibility, we remove $\gamma_i$ as a candidate for a biggest subgraph. For 
the second possibility, we remove $\gamma_j$ from consideration. For the last possibility, 
we leave both in. By successively eliminating the various subgraphs in this way, 
we are left with only the biggest subgraphs $\{M_i\}$, which are disjoint and unique. 
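
The elimination procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each subgraph is represented by the set of internal lines it contains, the labels `g1` to `g4` are hypothetical and are not taken from Figure 13.5, and we assume that every candidate subgraph has already been checked to be a renormalization part.

```python
from itertools import combinations

def nonoverlapping(a, b):
    """True if the subgraphs a and b (frozensets of line labels) are nested or disjoint."""
    return a <= b or b <= a or not (a & b)

def is_forest(subgraphs):
    """A forest is a family of subgraphs in which every pair is nonoverlapping."""
    return all(nonoverlapping(a, b) for a, b in combinations(subgraphs, 2))

def biggest_subgraphs(forest):
    """Keep only the elements that are not strictly contained in another element."""
    return [g for g in forest if not any(g < h for h in forest)]

def count_forests(candidates):
    """Count all subfamilies (including the empty one) that form a forest."""
    return sum(
        1
        for r in range(len(candidates) + 1)
        for family in combinations(candidates, r)
        if is_forest(family)
    )

# Hypothetical example: g3 contains g1, g4 contains g2, and the two towers are disjoint.
g1 = frozenset({1, 2})
g2 = frozenset({3, 4})
g3 = frozenset({1, 2, 5})
g4 = frozenset({3, 4, 6})
forest = [g1, g2, g3, g4]

maximal = biggest_subgraphs(forest)   # the disjoint biggest subgraphs M_i
n_forests = count_forests(forest)     # every subfamily is a forest here
print(maximal, n_forests)
```

Because the four hypothetical subgraphs are mutually nonoverlapping, every one of the $2^4$ subfamilies is a forest, and the biggest subgraphs are the two outermost ones.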

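The forest $U$ is then the union of full forests, one for each $M_i$. We can therefore 
rewrite Zimmermann's forest formula in Eq. (13.52) as: 

$$R_\Gamma = I_\Gamma + \sum_{\{M_i\}} \left( \sum_{U_1 \in \mathcal{F}(M_1)} \cdots \sum_{U_n \in \mathcal{F}(M_n)} \prod_{\gamma_1 \in U_1} \left(-t^{\gamma_1}\right) \cdots \prod_{\gamma_n \in U_n} \left(-t^{\gamma_n}\right) \right) I_\Gamma \tag{13.54}$$

where $\mathcal{F}(M_i)$ denotes the set of full forests of $M_i$. 

For example, consider Figure 13.5. The set of disjoint biggest subgraphs is 
$\{M_i\} = \{\emptyset, \gamma_1, \gamma_2, \gamma_3, \gamma_4, \gamma_1\gamma_2\}$. Then the terms farthest to the right contain the 
nested combinations $\{\gamma_3\gamma_1, \gamma_4\gamma_2\}$. In this way, Eq. (13.54) separates the forests 
into two sets: the disjoint set $\{M_i\}$ and the nested set. 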

The point of this construction is that we have rewritten the forest formula so 
that all nested sequences of graphs appear within the parentheses. This allows 
us to regroup these nested formulas into the form $(-t^{M_i})\bar{R}_{M_i} I_\Gamma$. Since $\{M_i\}$ 
is the unique set of disjoint biggest subgraphs within any forest, we have now 
converted sequences of nested subgraphs into a recursion relation involving only 
these biggest subgraphs. The nested graphs in Figure 13.5 have not disappeared; 
they have simply been hidden within the $M_i$. 

With this regrouping of graphs, Eq. (13.54) has now been converted to the 
expression $\sum \prod_i (-t^{M_i})\bar{R}_{M_i} I_\Gamma$. But written in this way, we recover the BPH 
formula of Eq. (13.49), based entirely on disjoint graphs $M_i$. This completes 
the sketch that Zimmermann's forest formula in Eq. (13.52) (based on nested and 
disjoint graphs) can be reexpressed in terms of the BPH recursion formula in Eq. 
(13.49) (based only on disjoint graphs $M_i$). This demonstrates the equivalence of 
the two formulas given earlier in Eq. (13.45). 

Now that we have rendered all graphs finite, the last step in the proof of BPHZ 
renormalization is to show that this subtraction technique can be accomplished by 
adding counterterms into the action. This is easy, since the subtraction process 
on disjoint graphs that we have outlined is equivalent to the process of adding 
counterterms into the action. Since the counterterms correspond to the set of 
divergent disjoint graphs, the procedure of subtracting off the divergences is 
identical to adding counterterms into the action. Since we saw earlier that these 
counterterms are proportional to the original action, we have now demonstrated 
that the BPHZ method is equivalent to multiplicative renormalization. 

The Yang-Mills theory, because it satisfies all the properties required by 
BPHZ, is therefore renormalizable. Not only does the Yang-Mills theory satisfy 
all the requirements coming from power counting, it also satisfies all the properties 
demanded by the BPHZ recursion method. (Since the BPHZ method makes no 
mention of gauge invariance, we must also impose the additional constraint of the 
Slavnov—Taylor identities to keep the renormalized coupling constants for gauge 
theory equal.) 

Finally, it is useful to compare the BPHZ method with the Dyson renormal- 
ization program mentioned earlier. In retrospect, there are some key differences 
between these two approaches. The Dyson renormalization program was based on 
defining skeleton graphs constructed out of renormalized vertices and self-energy 
graphs, such as $S_F'$. The Dyson approach from the very beginning tried to lump 
infinite classes of divergences into these renormalized vertices and self-energy 
graphs. The advantage of doing this is, of course, that one can immediately ex- 
tract out the multiplicative renormalization constants Z;. However, the price we 
paid for grouping the graphs from the very start into renormalized propagators 
and vertices was that we were plagued with overlapping divergences. Thus, the 
recursion relations had to be written out entirely in terms of vertices without the 
overlapping divergences, which often gave us clumsy equations. Another disad- 
vantage of the Dyson approach is that it was not very general. It was constructed 
explicitly for QED, and hence must be modified in significant ways to handle 
more general theories. 

This, however, is precisely the advantage of the BPHZ method: It is quite 
general. The BPHZ approach abandons the skeleton method of trying to lump 
divergent graphs from the very beginning into $S_F'$, etc. The BPHZ approach is 
based on successively adding counterterms to the action. These counterterms are 
chosen to subtract out the divergent integrand of any Feynman diagram, without 
performing any regrouping of diagrams into renormalized vertices and propaga- 
tors. As a result, we lose multiplicative renormalization at each intermediate step. 
However, the advantage of this is that we are no longer plagued by overlapping 
divergences. Only at the last step do we recognize that these subtractions give us 
counterterms that are proportional to terms in the original action, which in turn 
finally gives us multiplicative renormalization. 

(We should also point out some drawbacks of the BPHZ method. Because we 
subtracted all diagrams at zero momentum, infrared divergences are more difficult 
to handle in this approach. Also, the method must be modified to handle gauge 
invariances, since the Slavnov—Taylor identity must be added as an additional 
constraint.) 


13.5 Does Quantum Field Theory Really Exist? 


Because of the remarkable experimental success of quantum field theory in de- 
scribing the interactions of electrons and photons, we might be surprised to find 
that, strictly speaking, quantum field theory as a perturbation theory may not exist. 
This is because although we can successfully renormalize the perturbation series, 
there exists the possibility that the entire perturbation theory diverges. Simple 
arguments, in fact, show that perturbative quantum field theory is likely to 
diverge at extremely high order. Although the perturbation theory for QED seems 
to converge rapidly at low orders because $\alpha \approx 1/137$, eventually the Feynman 
graphs themselves may overwhelm the smallness of the fine structure constant 
and yield a divergent sum. 

For example, Dyson pointed out many years ago that for negative $\alpha$, QED 
should be unstable, with unlimited virtual pair production from the vacuum. 
Virtual pairs produced with sufficiently small separation may become 
real pairs by separating to larger distances. Thus, real pair production 
from the vacuum could progress unimpeded, and the theory could collapse with 
an unstable vacuum. Thus, QED may have a zero radius of convergence in $\alpha$ 
space. 

To see how the sum of a perturbation series might diverge, let us take the much 
simpler example of $\phi^4$ theory without any kinetic term, and let us replace the 
functional integral over $\phi$ with an ordinary integral. Already, at this simple level, 
we can see how the perturbation theory, although perfectly well behaved at any 
finite order, diverges at infinite order. 

Let us examine the behavior of the following partition function at high order: 

$$Z(g) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\phi^2/2 - g\phi^4}\, d\phi \tag{13.55}$$


This function is interesting because the coefficients in the expansion in $g$ equal 
the number of vacuum diagrams in $\phi^4$ theory. Although this integral cannot be 
performed in terms of elementary functions, we can always expand this function in 
powers of $g$ and then try to sum the perturbation theory. A simple power expansion yields: 

$$Z(g) = \sum_{n=0}^{\infty} Z_n\, g^n, \qquad Z_n = \frac{(-1)^n\,(4n)!}{2^{2n}\,(2n)!\,n!} \tag{13.56}$$


Our goal is to examine the behavior of the perturbation theory for large $n$. We can 
use the Stirling approximation formula: 

$$n! \sim \sqrt{2\pi}\; e^{(n + 1/2)\log n - n} \tag{13.57}$$

For large $n$, the perturbation theory therefore behaves as: 

$$Z_n \sim \frac{1}{\sqrt{\pi}}\,(-16)^n\, e^{(n - 1/2)\log n - n} \tag{13.58}$$


Although this simple example is unrealistic, we can already see the nontrivial 
behavior of the theory in g space. The perturbation theory diverges with large n. 
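
As a rough numerical check, the following sketch compares the exact expansion coefficients of the toy integral $Z(g) = (2\pi)^{-1/2}\int e^{-\phi^2/2 - g\phi^4}\,d\phi$ with their Stirling estimate. The function names `z_coeff` and `z_asymptotic` are ours, and the closed form used for the coefficients, $(-1)^n (4n)!/(2^{2n}(2n)!\,n!)$, is what one obtains by expanding the exponential and performing the Gaussian integrals term by term.

```python
from fractions import Fraction
from math import exp, factorial, log, pi, sqrt

def z_coeff(n):
    """Exact coefficient of g**n: (-1)^n (4n)! / (2^(2n) (2n)! n!)."""
    return Fraction((-1) ** n * factorial(4 * n),
                    4 ** n * factorial(2 * n) * factorial(n))

def z_asymptotic(n):
    """Stirling estimate (1/sqrt(pi)) (-16)^n exp((n - 1/2) log n - n), for n >= 1."""
    return (-16.0) ** n * exp((n - 0.5) * log(n) - n) / sqrt(pi)

for n in (1, 5, 10, 20):
    exact = float(z_coeff(n))
    # The ratio should drift toward 1 as n grows, while |exact| grows factorially.
    print(n, exact, exact / z_asymptotic(n))
```

The ratio of successive coefficients grows linearly with $n$, so no nonzero value of $g$ can compensate for it; this is the factorial growth that destroys convergence.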

In fact, a more careful analysis shows that the theory, in complex g space, has 
an essential singularity at $g = 0$. For any negative $g$, the integral over $\phi$ blows up 
and the theory breaks down. The potential is no longer bounded from below and 
the integral diverges. Thus, there is ample reason to believe that QED may suffer 
the same fate. 

The tremendous experimental accuracy of the theory, however, shows us that 
QED cannot be simply discarded as a physical theory just because the perturbation 
theory may not converge. QED has been able to withstand all challenges over the 
last 6 decades, and, not surprisingly, there is a resolution to this problem. 

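We can consider QED to be an asymptotic theory, that is, a theory that can, 
for fixed $n$ and $\alpha$ small enough, approach a definite result. For example, in our 
simple example, we may treat the perturbation series as an asymptotic series 
(valid for $\mathrm{Re}\, g \ge 0$): 

$$\left| Z(g) - \sum_{k=0}^{n} Z_k\, g^k \right| \le \frac{4^{n+1}\,\Gamma\!\left(2n + \tfrac{5}{2}\right)}{\sqrt{\pi}\,(n+1)!}\, |g|^{n+1} \tag{13.59}$$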
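
To make the notion of an asymptotic series concrete, here is a small numerical sketch for the toy integral $Z(g) = (2\pi)^{-1/2}\int e^{-\phi^2/2 - g\phi^4}\,d\phi$. It evaluates the integral by a simple trapezoid rule and compares it with the partial sums of the perturbation series. The value `g = 0.01`, the integration cutoff, and the helper names are illustrative choices of ours, not taken from the text.

```python
from math import exp, factorial, pi, sqrt

def z_exact(g, half_width=12.0, steps=200_000):
    """Trapezoid-rule estimate of (1/sqrt(2 pi)) * integral of exp(-phi^2/2 - g phi^4)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        phi = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * exp(-0.5 * phi * phi - g * phi ** 4)
    return total * h / sqrt(2.0 * pi)

def z_coeff(n):
    """Coefficient of g**n in the perturbation series of the toy integral."""
    return (-1) ** n * factorial(4 * n) / (4 ** n * factorial(2 * n) * factorial(n))

def partial_sums(g, n_max):
    """Partial sums S_0, S_1, ..., S_{n_max} of the perturbation series."""
    sums, running = [], 0.0
    for n in range(n_max + 1):
        running += z_coeff(n) * g ** n
        sums.append(running)
    return sums

g = 0.01
exact = z_exact(g)
errors = [abs(s - exact) for s in partial_sums(g, 25)]
best = min(range(len(errors)), key=errors.__getitem__)
print("best truncation order:", best, "error:", errors[best])
print("error at order 25:", errors[25])
```

The error first shrinks, reaches a minimum at a finite order of roughly $1/(16g)$, and then grows without bound. Truncating near the smallest term is what it means to use the series asymptotically.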
For our purposes, we may consider QED to be an asymptotic theory that 
will allow us to obtain perfectly convergent results, even though the original 
perturbation series, in principle, may not exist. One may also approach this 
problem from another direction. One may be able to generalize the definition of 
the original divergent function $Z(g)$ even if the perturbation theory is divergent. 
To do this, we will use the method of the Borel transform, which allows us to 
extract meaningful information from divergent series. For example, let us begin 
with a function $G(g)$ whose power expansion in $g$ diverges: 

$$G(g) = \sum_{n=1}^{\infty} a_n\, g^n \tag{13.60}$$


Although the original power expansion of G(g) makes no sense, it is possible 
to define a new power series that has much better convergence properties. To 
see this, let us divide each coefficient by n! in order to obtain a more convergent 
series: 


$$F(g) = \sum_{n=1}^{\infty} \frac{a_n}{n!}\, g^n \tag{13.61}$$
Although the original power expansion diverged, this new function has a radius 
of convergence $R_1$ given by: 

$$\frac{1}{R_1} = \limsup_{n \to \infty} \left| \frac{a_n}{n!} \right|^{1/n} \tag{13.62}$$


With this new function, we can reintroduce the original function G(g) by defining 
it to be: 


$$G(g) = \int_0^\infty e^{-t} F(tg)\, dt \tag{13.63}$$


This new definition of $G(g)$ reduces to the old one if we perform the integration 
over $dt$ term by term: 

$$G(g) = \sum_{n=1}^{\infty} \frac{a_n}{n!}\, g^n \int_0^\infty t^n e^{-t}\, dt = \sum_{n=1}^{\infty} a_n\, g^n \tag{13.64}$$

Although the original power expansion diverged, the advantage of this new 
definition of $G(g)$ is that $F$ may have a finite radius of convergence $R_1$, as in 
Eq. (13.62), while the radius of convergence of the old series was given by: 

$$\frac{1}{R_2} = \limsup_{n \to \infty} |a_n|^{1/n} \tag{13.65}$$

Thus, if $R_2 > 0$, then $R_1 = \infty$. Similarly, if there is a singularity in the Borel 
plane for $F(g)$, then $R_2 = 0$. 

Now let us use this technique to analyze the Borel transform for the function 
$Z(g)$. The key to this method is to define a new function $B(t)$ that is constructed 
from the same coefficients $Z_n$ found in the divergent series except that we divide 
each term by new factors sufficient to make the series converge. Then we take the 
inverse Borel transform in order to recover $Z(g)$ from $B(t)$. 

For example, we can define: 


$$B(t) = \sum_{n=0}^{\infty} \frac{Z_n\, t^n}{\Gamma(n+1)} \tag{13.66}$$

Because we have divided each term by the $\Gamma$ function, the series may now converge 
within a finite radius in $t$ space. 

Now that we have defined a function $B(t)$ that exists, we define the inverse 
transform to recover $Z(g)$: 

$$Z(g) = \int_0^\infty dt\; e^{-t} B(gt) \tag{13.67}$$


If this process of recovering the function Z(g) from its divergent perturbation 
series exists, then we say that the theory is Borel summable. 
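
A standard textbook illustration of this procedure, which we sketch here with a series that is simpler than the toy model above, is the alternating factorial series $\sum_n (-1)^n n!\, g^n$. Dividing each coefficient by $n!$ gives the geometric series $1/(1+t)$, which continues to the whole positive axis, so the inverse transform $\int_0^\infty e^{-t}/(1+gt)\,dt$ is finite. The code below evaluates that integral with a trapezoid rule; the cutoff `t_max`, the step count, and the helper names are our own illustrative choices.

```python
from math import exp, factorial

def borel_sum(g, t_max=60.0, steps=200_000):
    """Trapezoid estimate of the integral of exp(-t) / (1 + g t) over t from 0 to t_max."""
    h = t_max / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * exp(-t) / (1.0 + g * t)
    return total * h

def partial_sum(g, n_max):
    """Partial sum of the divergent series sum_n (-1)^n n! g^n."""
    return sum((-1) ** n * factorial(n) * g ** n for n in range(n_max + 1))

g = 0.1
value = borel_sum(g)
print("Borel sum:", value)
print("partial sum to order 9:", partial_sum(g, 9))
print("partial sum to order 40:", partial_sum(g, 40))
```

The low-order partial sums track the Borel sum closely, while the high-order ones run away. The Borel integral assigns the divergent series a definite value that agrees with the asymptotic expansion.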

Now let us analyze quantum field theories that might be Borel summable, even 
if the original perturbation theory diverges. We would like to analyze theories 
more realistic than this toy model that we have been studying. Our starting point 
will be the usual N-point Green’s function, but defined in Euclidean space: 


$$\langle 0 |\, T\left[\phi(x_1)\phi(x_2) \cdots \phi(x_N)\right] | 0 \rangle$$

where $\phi(x)$ is a generic field for an arbitrary field theory of arbitrary spin. Our 
task is to take the Borel transform of this function in order to find when the Borel 
transform diverges. 
To analyze this Green’s function, we will rewrite the numerator of this function 
as: 


$$N(g) = \int_0^\infty e^{-t} F(gt)\, dt \tag{13.69}$$

where: 

$$F(z) = \int D\phi\; \delta\big(z - S(\phi)\big)\, \phi(x_1)\phi(x_2) \cdots \phi(x_N) \tag{13.70}$$

(To prove the equivalence of this expression with the original numerator of 
the Green's function, simply insert the expression for $F$ into $N$ and perform the 
integration over $t$, which is trivial because of the delta function.) 

We recognize F to be the Borel transform. In order to analyze the singularities 
of the transform, it is helpful to analyze the singularities of a much simpler 
expression. We would like to analyze the singularities of the following function: 


$$\int du_1\, du_2 \cdots du_N\; \delta\big(z - f(u_1, u_2, \ldots, u_N)\big) = \int_\Sigma \frac{d\sigma}{|\nabla f|} \tag{13.71}$$

where $\Sigma$ is a hypersurface in $\{u\}$ space on which $f(u) = z$. (To prove this 
identity, perform the integration over, say, $u_1$. Then invert this implicit function, 
and rewrite the expression in a more symmetric fashion.) 

This function obviously diverges if there is a point where: 

$$|\nabla f|^2 = \sum_i \left| \frac{\partial f}{\partial u_i} \right|^2 = 0 \tag{13.72}$$

Then the denominator vanishes, and the function becomes singular. 

Now replace $u_i$ with $\phi(x_i)$ and $f(u)$ with $S(\phi)$. Then, if we perform the 
functional integral over $\phi$, we find that the resulting integral is singular if, for 
some $S(\phi) = z$, there is a point satisfying the usual equations of motion: 

$$\frac{\delta S(\phi)}{\delta \phi(x)} = 0 \tag{13.73}$$

In summary, we have shown that the Borel transform $F$ blows up if there is 
a solution to the Euclidean equations of motion where the action $S(\phi)$ is finite. 
These finite-action, Euclidean solutions spoil Borel summability. 

Unfortunately, such finite-action solutions to the Euclidean equations of mo- 
tion actually exist. They are called instantons, and represent genuine solutions to 
the gauge theory with Euclidean metric. Instantons will be discussed in greater 
depth in Chapter 16, where they will play a key role in our understanding of the 
stability of the vacuum. Thus the perturbation theory of gauge theory is neither 
convergent nor is it Borel summable. QCD, for example, has zero radius of 
convergence. We must, as a consequence, treat it strictly as an asymptotic theory. 
This also has practical implications for QCD. There are, as we have noted, 
an infinite number of possible renormalization schemes. Usually, we say that the 
sum of the perturbation series is independent of whichever scheme we choose. 
However, in actual practice, as we have seen, the various subtraction schemes 
have different convergence properties. 

In summary, the divergences of gauge theory are only a bit more compli- 
cated than those of QED. In both cases, power counting arguments show that the 
divergences of a graph are functions of the number of external lines, and these 
divergences can be absorbed into a renormalization of the physical parameters. 

We have also seen that the BPHZ method gives us a powerful method of 
renormalizing quantum field theories, including gauge theory. The advantage 
of the BPHZ method is that overlapping divergences, which give rise to severe 
complications for the Dyson approach, do not have to be treated separately. The 
BPHZ method also gives us a simple formalism in which to handle counterterms. 
No explicit regularization is needed. 

In the next chapter, we will use renormalization theory to give us perhaps the 
most important experimental verification of QCD. 


13.6 Exercises 


1. Draw all the Feynman graphs in gauge theory with fermions necessary to 
calculate $Z_4$, $Z_5$, $Z_6$, $Z_7$, and $Z_8$ to one-loop order. 


2. Consider a $\phi^3$ theory. Consider (a) a four-loop diagram with the topology of a 
ladder with five rungs; (b) a four-loop self-energy graph, consisting of a circle 
containing three interior parallel vertical lines, with two external lines coming 
out from the left and right. Break them both down in terms of a skeleton and 
a forest decomposition. 


3. Couple SU(N) Yang-Mills theory to a Yukawa theory of mesons with quartic 
interactions. By power counting, find all primitively divergent graphs includ- 
ing ghosts. Show which graphs correspond to the renormalization of which 
physical parameters. 


4. For the Yang—Mills theory coupled to Yukawa mesons, write down the coun- 
terterms that must be added to the action to renormalize it. Find the relations 
between the various Z; that are preserved by the Slavnov—Taylor identity. 


5. From Feynman's rules for this same theory, set up the dimensionally regulated 
integrals necessary to compute the scalar meson self-energy diagram and 
scalar—scalar—vector meson vertex to lowest order. Do not solve. 

6. Consider the Slavnov–Taylor identity to one-loop order in gauge theory 
coupled to fermions. Prove two of the relations appearing in Eq. (13.15) to that 
order. 


7. Beginning with Feynman’s rules, fill in the missing steps in Eqs. (13.24) and 
(13.26). 


8. Beginning with Feynman’s rules, fill in the missing steps in Eqs. (13.30) and 
(13.32). 


9. Prove Eqs. (13.56) and (13.58). 


10. Prove that Eq. (13.43) in Zimmermann's approach can be re-expressed as a 
recursion relation, as in Eq. (13.49) in BPH’s approach. 


Chapter 14 


QCD and the 
Renormalization Group 


There's a long tradition in theoretical physics, which by no means affected 
everyone but certainly affected me, that said the strong interactions are 
too complicated for the human mind. 

— S. Weinberg 


14.1 Deep Inelastic Scattering 


One of the great theoretical breakthroughs in gauge theory was the realization 
that the renormalization theory of gauge theories may explain many of the curious 
features found in deep inelastic scattering. In fact, it was the remarkable success 
of gauge theory in explaining the Stanford Linear Accelerator Center (SLAC) 
experiments on electron—proton scattering that helped to elevate QCD into the 
leading theory of the strong interactions. At very high energies, the form factors 
begin to lose some of their dependence on certain low-energy dimensional 
parameters for $|q^2| > 2\ \mathrm{GeV}^2$. This phenomenon is called scaling. For the deep 
inelastic scattering experiments at SLAC, where a high-energy beam of electrons 
was scattered off a proton target, Bjorken¹ predicted that scaling should occur 
(using current algebra, Regge asymptotics, and kinematics). 

The deep inelastic scattering amplitude was calculated for the process (Fig. 14.1): 

$$e^- + p \to e^- + \text{anything} \tag{14.1}$$


for large momentum transfers of the electron. This was an ideal experiment to 
analyze the structure of the proton, since the probe was an off-shell photon, which 
has a relatively clean interaction with the hadrons. 


Figure 14.1. Deep inelastic scattering: in electron-proton scattering, an off-shell photon 
probes the structure of the proton. 


The simplest explanation of scaling comes from Feynman’s parton model, 
where the proton is assumed to consist of point-like constituents.²,³ Remarkably, 
such a simple picture explained many of the qualitative features of the SLAC 
experiments, including scaling. 

There was a puzzle, however. If the proton was a bound state of some mys- 
terious force, then presumably nonperturbative effects were dominant. However, 
the parton model indicated that, at high energies, the partons (e.g., quarks) could 
be considered to act like free point-like particles. Apparently, nonperturbative 
effects could somehow be neglected, and we could assume the quarks were free 
to roam inside the proton. 

This simple experimental picture was then explained through QCD. Using 
the theory of the renormalization group, it could be shown that the renormalized 
coupling constant varied with the energy scale. At increasingly high energies, 
the coupling constant of the strong force became smaller and smaller, so that the 
quarks could be treated as if they were free point-like particles in the asymptotic 
domain. This effect was called asymptotic freedom. A general analysis revealed 
that non-Abelian gauge theories were the only field theories in which asymptotic 
freedom was exhibited. 

The flip side of asymptotic freedom was that, at smaller and smaller energies, 
the coupling constant became increasingly large. This could, in principle, explain 
why the quarks were permanently confined within the hadrons. 

Let us explain the development of asymptotic freedom by first giving the 
experimental results at SLAC on scaling, and then continue our discussion of 
renormalization theory and the renormalization group, leading up to the celebrated 
result that non-Abelian gauge theories are asymptotically free. 

We will close this chapter by showing that the renormalization group equations 
give us a recursion relation that yields yet another method of renormalizing field 
theory. 

We begin by defining the kinematics of electron-proton deep inelastic scattering. 
Let the incoming electron have momentum $k$, and the outgoing electron have 
momentum $k'$. Then we define: 

$$q = k - k', \qquad \nu = \frac{p \cdot q}{M}, \qquad x = -\frac{q^2}{2M\nu} \tag{14.2}$$


The SLAC experiments probed the interior of the proton with a photon that 
was very much off-shell (i.e., $q^2 \to -\infty$). 
In the lab frame, where the proton is at rest, we have the following: 

$$p_\mu = (M, 0, 0, 0); \qquad k_\mu = (E, \mathbf{k}); \qquad k'_\mu = (E', \mathbf{k}') \tag{14.3}$$

Therefore, in the limit of small electron mass, we have: 

$$\nu = E - E', \qquad q^2 = -4EE' \sin^2(\theta/2) \le 0 \tag{14.4}$$


where $\theta$ is the scattering angle. 
We will be interested in the deep inelastic region, which is defined by: 

$$\text{Deep inelastic region} = \begin{cases} \nu \to \infty \\ -q^2 \to \infty \\ x \ \text{fixed} \end{cases} \tag{14.5}$$


We can show that $0 \le x \le 1$. (This parameter measures how far we are from 
elastic scattering, which corresponds to the point $x = 1$.) 
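
As a quick numerical illustration of these lab-frame kinematics, the sketch below computes $q^2 = -4EE'\sin^2(\theta/2)$, $\nu = E - E'$, and $x = -q^2/2M\nu$ from the beam energy, the scattered energy, and the scattering angle, neglecting the electron mass. The function name `dis_kinematics`, the proton mass value, and the sample energies are our own illustrative choices.

```python
from math import radians, sin

PROTON_MASS = 0.938  # GeV, approximate value used only for illustration

def dis_kinematics(E, E_prime, theta_deg, M=PROTON_MASS):
    """Return (q2, nu, x) in the lab frame, neglecting the electron mass."""
    nu = E - E_prime
    q2 = -4.0 * E * E_prime * sin(radians(theta_deg) / 2.0) ** 2
    x = -q2 / (2.0 * M * nu)
    return q2, nu, x

# Hypothetical event: 20 GeV beam, 10 GeV scattered electron, 10 degree angle.
q2, nu, x = dis_kinematics(20.0, 10.0, 10.0)
print(q2, nu, x)

# Elastic scattering fixes E' in terms of E and the angle, and then x = 1.
E, theta = 10.0, 30.0
E_elastic = E / (1.0 + (2.0 * E / PROTON_MASS) * sin(radians(theta) / 2.0) ** 2)
print(dis_kinematics(E, E_elastic, theta)[2])
```

The second call checks the statement that elastic scattering sits at the endpoint $x = 1$: there the final hadronic state is again a proton, so $q^2 + 2M\nu = 0$.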

Using Feynman's rules, let us construct the scattering amplitude of an electron 
colliding with a proton of polarization $\sigma$, emitting some unknown state $|n\rangle$: 

$$\mathcal{M}_n = \left[ e^2\, \bar{u}(k', s') \gamma^\mu u(k, s) \right] \left( \frac{1}{q^2} \right) \left[ \langle n | J_\mu(0) | p, \sigma \rangle \right] \tag{14.6}$$

where $J_\mu$ is the electromagnetic current, and the matrix element of this current 
between hadronic states is unknown. 

Using the standard rules for constructing differential cross sections, we find 
that the scattering into the $n$th final state is given by: 

$$d\sigma_n = \frac{1}{|\mathbf{v}|} \frac{1}{2M\, 2E} \frac{d^3k'}{(2\pi)^3\, 2k'_0} \prod_{i=1}^{n} \frac{d^3p_i}{(2\pi)^3\, 2p_{i0}} \times \frac{1}{4} \sum_{s, s', \sigma} |\mathcal{M}_n|^2\, (2\pi)^4 \delta^4(p + k - k' - p_n) \tag{14.7}$$

where $p_n$ is the sum of the momenta of the various hadronic final states. 
Now let us sum over all the hadronic final states $n$, and we obtain the inclusive 
cross section: 

$$\frac{d^2\sigma}{d\Omega\, dE'} = \frac{\alpha^2}{q^4} \left( \frac{E'}{E} \right) l^{\mu\nu} W_{\mu\nu} \tag{14.8}$$


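where the leptonic tensor $l_{\mu\nu}$ is given by the spin-summed product of the lepton currents: 

$$l_{\mu\nu} = \frac{1}{2} \mathrm{Tr}\left( \slashed{k}' \gamma_\mu \slashed{k} \gamma_\nu \right) = 2 \left( k'_\mu k_\nu + k_\mu k'_\nu + \frac{q^2}{2} g_{\mu\nu} \right) \tag{14.9}$$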


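The hadronic tensor $W_{\mu\nu}$ is the object we wish to study, since it is basically 
unknown. It can be vastly simplified, however, by explicitly performing the sum 
over the unknown final state $|n\rangle$. Using completeness arguments, dependence on 
$|n\rangle$ disappears: 

$$W_{\mu\nu} = \frac{1}{4M} \sum_\sigma \sum_n \int \prod_{i=1}^{n} \left[ \frac{d^3p_i}{(2\pi)^3\, 2p_{i0}} \right] \langle p, \sigma | J_\mu(0) | n \rangle \langle n | J_\nu(0) | p, \sigma \rangle\, (2\pi)^3 \delta^4(p_n - p - q)$$

$$\phantom{W_{\mu\nu}} = \frac{1}{4M} \sum_\sigma \int \frac{d^4x}{2\pi}\, e^{iq \cdot x}\, \langle p, \sigma | [J_\mu(x), J_\nu(0)] | p, \sigma \rangle \tag{14.10}$$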


[In the last step, we have written the product of two currents as a commutator. 
The extra term $J_\nu(0) J_\mu(x)$ does not contribute, because it occurs with the 
momentum constraint $p_n = p - q$. In the lab frame, this means that $E_n = M - \nu$, 
which cannot be satisfied.] 

We know from current conservation that $\partial_\mu J^\mu = 0$, or: 

$$q^\mu W_{\mu\nu} = q^\nu W_{\mu\nu} = 0 \tag{14.11}$$


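Thus, using general invariance arguments, we can re-express $W_{\mu\nu}$ in terms of only 
two form factors $W_1$ and $W_2$: 

$$W_{\mu\nu} = \left( -g_{\mu\nu} + \frac{q_\mu q_\nu}{q^2} \right) W_1 + \left( p_\mu - \frac{p \cdot q}{q^2} q_\mu \right) \left( p_\nu - \frac{p \cdot q}{q^2} q_\nu \right) \frac{W_2}{M^2} \tag{14.12}$$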
Inserting this expression back into the differential cross section, we find: 

$$\frac{d\sigma}{dq^2\, d\nu} = \frac{4\pi\alpha^2}{q^4} \frac{E'}{E} \left[ W_2 \cos^2\left( \frac{\theta}{2} \right) + 2W_1 \sin^2\left( \frac{\theta}{2} \right) \right] \tag{14.13}$$

Experimentally, it was discovered that, in the deep inelastic region, the 
dependence on $q^2$ and $\nu$ was replaced by the dependence on $x = -q^2/2M\nu$ alone in the 
structure functions: 

$$MW_1(q^2, \nu) \to F_1(x)$$
$$\nu W_2(q^2, \nu) \to F_2(x) \tag{14.14}$$

This relation is called Bjorken scaling. 


14.2 Parton Model 


The most intuitive explanation of the scaling relations came from the parton model. 
The parton model simply assumed that the dominant contribution to the hadronic 
tensor $W_{\mu\nu}$ came from the scattering of point-like constituents within the proton of 
unknown spin. It was a very naive picture of the proton, but it worked surprisingly 
well. In fact, it became a central mystery as to why such a naive model worked 
so well, far beyond its hypothetical range of validity. 

The essence of the parton model can be summarized in Figure 14.2, where 
the dominant contribution to the hadronic tensor comes from the scattering of the 
off-shell photon with a parton. 




Figure 14.2. The parton model: an off-shell photon scatters off a point-like constituent of 
the proton. Comparing the resulting sum rules with experiment shows that the parton has 
spin $\frac{1}{2}$ and most likely corresponds to a quark. 

We assume that the parton has negligible transverse momentum with respect 
to the proton, so the parton momentum is in the same direction as the proton 
momentum; that is, the parton has momentum $\xi p_\mu$, where $0 \le \xi \le 1$. 

As one might suspect, the secret of the parton model’s ability to explain scaling 
lies in the kinematics of Figure 14.2. To see how scaling emerges from this simple 
picture, notice that momentum conservation forces us to have: 

$$p' = \xi p + q \tag{14.15}$$

Now square both sides of this equation. We arrive at: 

$$p'^2 = \xi^2 p^2 + 2M\nu\xi + q^2 \tag{14.16}$$

In the scaling region, where $p^2$ and $p'^2$ can be neglected, we have: 

$$q^2 + 2M\nu\xi \approx 0 \tag{14.17}$$

In other words: 

$$\xi = x \tag{14.18}$$

This is important, because it means that all structure functions will become 
functions of $\xi$ or $x$ alone. This, of course, is the essence of scaling. Thus, a very 
simple kinematic picture of partons yields scaling behavior. 

The naive parton model tells us more. From Eq. (14.15), one concludes that 
x is the fraction of the momentum carried by the parton in the nucleon. For a 
given spin, it allows us to calculate restrictions on $W_1$ and $W_2$. By checking 
these structure functions against experiment, one can therefore determine the spin 
of the parton. To see how the spin of the parton is determined, we note that, 
in this approximation, the matrix element $\langle \xi p, \sigma | J_\mu(0) | p', \sigma' \rangle$ is proportional to 
$\bar{u}(\xi p) \gamma_\mu u(p')$ for spin-$\frac{1}{2}$ partons. Thus, the contribution to the hadronic tensor 
coming from a parton of momentum $\xi p$ is given by: 

$$K_{\mu\nu}(\xi) = \frac{1}{4M\xi} \sum_{\text{spins}} \left[ \bar{u}(\xi p) \gamma_\mu u(p') \right] \left[ \bar{u}(p') \gamma_\nu u(\xi p) \right] \frac{\delta(p'_0 - \xi p_0 - q_0)}{2p'_0} \tag{14.19}$$


The total hadronic tensor is given by integrating this over all $\xi$. Let the 
number of partons of momentum $\xi p$ be proportional to some unknown function 
$f(\xi)$. Then the total hadronic tensor from all partons is given by: 

$$W_{\mu\nu} = \int_0^1 f(\xi)\, K_{\mu\nu}(\xi)\, d\xi \tag{14.20}$$
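
Let us now calculate all sums appearing in the scattering amplitude for the 
partons. The sum over the spinors is easily calculated using the usual rules: 

$$\frac{1}{2} \sum_{\text{spins}} \bar{u}(\xi p) \gamma_\mu u(\xi p + q)\, \bar{u}(\xi p + q) \gamma_\nu u(\xi p) = \frac{\xi}{2} \mathrm{Tr}\left[ \slashed{p} \gamma_\mu (\xi \slashed{p} + \slashed{q}) \gamma_\nu \right]$$

$$\phantom{\frac{1}{2} \sum_{\text{spins}}} = 4\xi^2 p_\mu p_\nu - 2M\nu\xi\, g_{\mu\nu} + \cdots \tag{14.21}$$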


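Now comes the crucial step. We will rewrite the delta function over momenta to 
explicitly display the fact that $\xi = x$: 

$$\frac{\delta(p'_0 - \xi p_0 - q_0)}{2p'_0} = \theta(p'_0)\, \delta\left[ p'^2 - (\xi p + q)^2 \right]$$
$$= \theta(\xi p_0 + q_0)\, \delta(2M\nu\xi + q^2)$$
$$= \theta(\xi p_0 + q_0)\, \frac{\delta(\xi - x)}{2M\nu} \tag{14.22}$$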


It is important to note that we have generated the factor $\delta(\xi - x)$ from kinematic 
arguments alone. Now insert everything back into the hadronic tensor. The 
integral over $\xi$ is now trivial to perform, and we arrive at: 

$$W_{\mu\nu} = p_\mu p_\nu\, \frac{x f(x)}{M^2 \nu} - g_{\mu\nu}\, \frac{f(x)}{2M} + \cdots \tag{14.23}$$


Now let us compare this tensor with Eq. (14.12). We find that we have now 
derived: 

$$MW_1 \to F_1(x) = \frac{1}{2} f(x)$$
$$\nu W_2 \to F_2(x) = x f(x) \tag{14.24}$$

Not only have we established scaling, we have also derived the simple relation: 

$$2x F_1(x) = F_2(x) \tag{14.25}$$

which is the Callan–Gross relation.⁴ 
The usefulness of the parton model is that we can compare the scaling behavior 
of $W_{1,2}$ against the various predictions for spin-0 and spin-$\frac{1}{2}$ partons. For example, 
for spin-0 partons, general invariance arguments show that we have: 

$$\langle xp | J_\mu | xp + q \rangle \sim (2xp + q)_\mu \tag{14.26}$$

Therefore: 

$$W_{\mu\nu} \sim (2xp + q)_\mu (2xp + q)_\nu \tag{14.27}$$

Because this expression contains no term proportional to $g_{\mu\nu}$, comparing it 
with the previous expression for the hadronic tensor in Eq. (14.12), we find: 

$$\text{Spin 0:} \quad F_1(x) = 0 \tag{14.28}$$


Experimentally, the Callan—Gross relation is reasonably satisfied, while the 
spin-0 parton relation is not. This gives us confidence that the partons are, in fact, 
just the quarks. 
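
The two predictions can be contrasted with a toy parton density. The sketch below is only an illustration: the density `toy_density` is hypothetical, and for the spin-0 case we assume the same normalization of $F_2$ as in the spin-$\frac{1}{2}$ case, since only the vanishing of $F_1$ is fixed by the argument above.

```python
def parton_structure_functions(f, x, spin_half=True):
    """Return (F1, F2) for a parton density f at momentum fraction x.

    Spin 1/2: F1 = f(x)/2 and F2 = x f(x), as in Eq. (14.24).
    Spin 0:   F1 = 0, as in Eq. (14.28); the F2 normalization is an assumption.
    """
    F2 = x * f(x)
    F1 = 0.5 * f(x) if spin_half else 0.0
    return F1, F2

def toy_density(x):
    # Hypothetical parton density, chosen only for illustration.
    return 3.0 * (1.0 - x) ** 2

for x in (0.1, 0.3, 0.5, 0.7):
    F1_half, F2_half = parton_structure_functions(toy_density, x)
    F1_zero, F2_zero = parton_structure_functions(toy_density, x, spin_half=False)
    # The ratio 2 x F1 / F2 is 1 for spin-1/2 partons and 0 for spin-0 partons.
    print(x, 2 * x * F1_half / F2_half, 2 * x * F1_zero / F2_zero)
```

The ratio $2xF_1/F_2$ is therefore a direct experimental discriminator of the parton spin, independent of the unknown density $f(x)$.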

Next, we want to calculate the form factors $F_{1,2}$ in terms of the various quark 
constituents contained within the nucleon. In the naive quark model, as well as 
in QCD, the electromagnetic current appearing in the scattering amplitude was 
given by: 


$$J_\mu = \frac{2}{3} \bar{u} \gamma_\mu u - \frac{1}{3} \bar{d} \gamma_\mu d - \frac{1}{3} \bar{s} \gamma_\mu s \tag{14.29}$$


since the charges of the quarks are given by 2/3, —1/3, —1/3. 

Each piece of the electromagnetic current, given by the respective quark fields, 
contributes to the structure function, which is now the sum of the squares of the 
various contributions from each quark. Let us now separate out each individual 
contribution of each quark current to $F_1$. Since $F_1$ is written in terms of the square 
of the current, it can be written as the sum over the square of the quark charge 
times the individual distribution function: 

$$2F_1(x) = \sum_{i = u, d, s} Q_i^2 \left[ q_i(x) + \bar{q}_i(x) \right] \tag{14.30}$$

where the charge of the quark is given by $Q_i$ and $q_i(x)$ is the distribution function 
for the $i$th quark. 
Then we have: 

$$2F_1^{ep} = \frac{4}{9}(u_p + \bar{u}_p) + \frac{1}{9}(d_p + \bar{d}_p) + \frac{1}{9}(s_p + \bar{s}_p)$$

$$2F_1^{en} = \frac{4}{9}(u_n + \bar{u}_n) + \frac{1}{9}(d_n + \bar{d}_n) + \frac{1}{9}(s_n + \bar{s}_n) \tag{14.31}$$

where we have used the symbol $u_p(x)$, etc. to represent the $u$-quark contribution to 
the structure function for $e + p$ scattering. These functions represent the probability 
of finding a quark-parton carrying a fraction $x$ of the longitudinal momentum for the given 
process. The coefficient appearing before each quark contribution is nothing but 
the square of the quark charge. Let us assume SU(2) isospin symmetry holds. 
Then the $u_p$ parton distribution function equals that of its isospin partner $d_n$. Isospin 
invariance gives us the equivalence: 

$$u_p = d_n \equiv u$$
$$d_p = u_n \equiv d$$
$$s_p = s_n \equiv s \tag{14.32}$$


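We will now drop the subscript $p$ on the proton quark distribution functions. 
Then we can write: 

$$\frac{F_1^{ep}(x)}{F_1^{en}(x)} = \frac{4(u + \bar{u}) + (d + \bar{d}) + (s + \bar{s})}{(u + \bar{u}) + 4(d + \bar{d}) + (s + \bar{s})} \tag{14.33}$$

Since all the distribution functions are non-negative, we have the constraint: 

$$\frac{1}{4} \le \frac{F_1^{ep}(x)}{F_1^{en}(x)} \le 4 \tag{14.34}$$

which agrees with the data. 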
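
The bound follows only from the positivity of the distribution functions, which the following sketch checks numerically. The function name `f1_ratio` and the randomly generated densities are illustrative assumptions of ours, not fits to data.

```python
import random

def f1_ratio(u, ubar, d, dbar, s, sbar):
    """Ratio of proton to neutron structure functions for given quark densities at fixed x."""
    proton = 4.0 * (u + ubar) + (d + dbar) + (s + sbar)
    neutron = (u + ubar) + 4.0 * (d + dbar) + (s + sbar)
    return proton / neutron

random.seed(0)
ratios = [
    f1_ratio(*[random.uniform(0.01, 1.0) for _ in range(6)])
    for _ in range(1000)
]
# Every sampled ratio should fall between 1/4 and 4.
print(min(ratios), max(ratios))

# The bounds are saturated when only u-type or only d-type quarks are present.
print(f1_ratio(1, 0, 0, 0, 0, 0), f1_ratio(0, 0, 1, 0, 0, 0))
```

The limiting values correspond to a nucleon made purely of $u$-type or purely of $d$-type partons at the given $x$; any admixture of strange quarks pulls the ratio toward 1.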


14.3 Neutrino Sum Rules 


Next, we would like to study neutrino-nucleon inclusive reactions: 

$$\nu + N \to e^- + \text{anything} \tag{14.35}$$


which resemble the electron—nucleon inclusive reactions except that we use dif- 
ferent currents within the Hamiltonian, and we have more invariant tensors in the 
decomposition of the transition function. 

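For neutrino scattering, the hadronic current is given by Eq. (11.105): 

$$J^\mu_{\text{had}} = \bar{u} \gamma^\mu (1 - \gamma_5)(d \cos\theta_C + s \sin\theta_C) \tag{14.36}$$

and the leptonic part is given by: 

$$J^\mu_{\text{lept}} = \bar{\nu}_e \gamma^\mu (1 - \gamma_5) e + \bar{\nu}_\mu \gamma^\mu (1 - \gamma_5) \mu + \cdots \tag{14.37}$$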
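
Once again, the cross sections can be expressed in terms of various structure 
functions $W_i$: 

$$W_{\mu\nu} = \frac{1}{4M} \sum_\sigma \int \frac{d^4x}{2\pi}\, e^{iq \cdot x}\, \langle p, \sigma | [J^\dagger_\mu(x), J_\nu(0)] | p, \sigma \rangle$$
$$= -W_1 g_{\mu\nu} + W_2\, p_\mu p_\nu / M^2 - iW_3\, \epsilon_{\mu\nu\alpha\beta}\, p^\alpha q^\beta / M^2$$
$$\quad + W_4\, q_\mu q_\nu / M^2 + W_5 (p_\mu q_\nu + p_\nu q_\mu) / M^2$$
$$\quad + iW_6 (p_\mu q_\nu - p_\nu q_\mu) / M^2 \tag{14.38}$$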

where, because of the nature of weak interactions, we have more possible tensors 
in the decomposition. 
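Then the cross section can be written in terms of these structure functions as:

$$\frac{d^2\sigma^{\nu,\bar\nu}}{d\Omega\,dE'} = \frac{G_F^2E'^2}{2\pi^2}\left[2\sin^2\left(\frac{\theta}{2}\right)W_1 + \cos^2\left(\frac{\theta}{2}\right)W_2 \mp \frac{E + E'}{M}\sin^2\left(\frac{\theta}{2}\right)W_3\right] \tag{14.39}$$

where the $-$ ($+$) sign corresponds to $\nu$ ($\bar\nu$) scattering.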
In the Bjorken scaling limit, we find:

$$MW_1(q^2,\nu) \to F_1(x)$$

$$\nu W_2(q^2,\nu) \to F_2(x)$$

$$\nu W_3(q^2,\nu) \to F_3(x) \tag{14.40}$$

where the neutrino scattering amplitude has one additional structure function $W_3$.

As before, we can now write down a number of relations for the structure
functions $W_i$ using the fact that the scattering process probes the quark structure
of the nucleon. Analyzing the quantum numbers of the $\nu + N$ reaction, we see that the
hadronic current induces the transitions:

$$d \to u, \qquad s \to c, \qquad \bar u \to \bar d, \qquad \bar c \to \bar s \tag{14.41}$$

which appear multiplied by the factor $\cos^2\theta_C$, which we will take to be equal to
1. Similarly, the Cabibbo-suppressed transitions:

$$d \to c, \qquad s \to u, \qquad \bar u \to \bar s, \qquad \bar c \to \bar d \tag{14.42}$$

are proportional to $\sin^2\theta_C$ and will be dropped. For $\bar\nu + N$, the favored and
unfavored reactions can be found by simply reversing the direction of the arrow.
As before, we can write various sum rules by calculating the contribution
of the various quark distribution functions to the structure functions. In $e + N$
scattering, we found earlier that the structure functions were proportional to $Q_i^2$
times the quark distribution function, as in Eq. (14.30). For $\nu + N$ scattering,
the contribution of the $i$th quark to $F_2$ or $xF_3$ is proportional to $g_i^2q_i(x)$, where
$q_i$ is the distribution function of the $i$th quark, and $g_i^2$ is either $\cos^2\theta_C$ or $\sin^2\theta_C$.
We will set $\theta_C \approx 0$ for now. The total contribution of the quarks to the structure
functions $F_2$ and $xF_3$ is then the sum over the various quark contributions:

$$F_2(x) = 2x\sum_{i,j}\left[g_i^2q_i(x) + g_j^2\bar q_j(x)\right]$$

$$xF_3(x) = -2x\sum_{i,j}\left[g_i^2q_i(x) - g_j^2\bar q_j(x)\right] \tag{14.43}$$

where the overall minus sign in $xF_3$ follows from the sign convention for the $W_3$
term in Eq. (14.39).


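We can read off the quark functions $q_i$ that have a nonzero contribution to this
sum by analyzing Eq. (14.41). For example, for $\nu + p$ scattering, Eq. (14.41)
shows us that only the $d$, $s$, $\bar u$, and $\bar c$ quark functions contribute with coefficient
$\cos^2\theta_C \approx 1$.

Then the complete list of structure functions, written as sums over various
quark probability distribution functions, is:

$$\begin{aligned}
\nu p:&\quad F_2 = 2x(d + s + \bar u + \bar c); &\quad xF_3 &= -2x(d + s - \bar u - \bar c)\\
\nu n:&\quad F_2 = 2x(u + s + \bar d + \bar c); &\quad xF_3 &= -2x(u + s - \bar d - \bar c)\\
\bar\nu p:&\quad F_2 = 2x(u + c + \bar d + \bar s); &\quad xF_3 &= -2x(u + c - \bar d - \bar s)\\
\bar\nu n:&\quad F_2 = 2x(d + c + \bar u + \bar s); &\quad xF_3 &= -2x(d + c - \bar u - \bar s)
\end{aligned} \tag{14.44}$$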


For the most part, we will ignore the contribution of the strange and charmed
quarks to the proton and neutron scattering function, since the nucleon is primarily
made of up and down quarks. Then we have:


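$$F_2^{\bar\nu p} - F_2^{\nu p} = 2x\left[u(x) - \bar u(x) - d(x) + \bar d(x)\right] = 4xT_3(x) \tag{14.45}$$

where $T_3$ is the isospin density, which integrates to one-half for the proton. By
integrating this expression, we then arrive at the Adler sum rule$^{5}$:

$$\int_0^1\frac{dx}{x}\left[F_2^{\bar\nu p}(x) - F_2^{\nu p}(x)\right] = 4\int_0^1T_3(x)\,dx = 2 \tag{14.46}$$

We can also take the sum of the third structure functions for $\nu p$ and $\nu n$ scattering:

$$F_3^{\nu p} + F_3^{\nu n} = -2\left[u(x) + d(x) - \bar u(x) - \bar d(x)\right] \tag{14.47}$$

Since each quark carries baryon number 1/3, we can therefore write (for zero strangeness):

$$F_3^{\nu p} + F_3^{\nu n} = -6B(x) \tag{14.48}$$

where $B(x)$ is the baryon number density. Since the proton has baryon number $B$
equal to one, we then find the Gross-Llewellyn Smith sum rule$^{6}$:

$$\int_0^1dx\left[F_3^{\nu p}(x) + F_3^{\nu n}(x)\right] = -6 \tag{14.49}$$

The experimental value for this is roughly $-6.4 \pm 1.2$.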
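
Both sum rules only count valence quarks, which can be seen in a small numerical experiment. The sketch below assumes toy shapes for the valence and sea distributions (the functional forms, the helper names `integrate` and `normalized`, and the sea normalization are all invented for illustration); only the valence normalizations, two $u$ quarks and one $d$ quark, are physical input.

```python
def integrate(f, n=20_000):
    """Midpoint rule on the interval (0, 1)."""
    h = 1.0 / n
    return h * sum(f((k + 0.5) * h) for k in range(n))

def normalized(shape, total):
    """Rescale a shape function so that it integrates to `total` on (0, 1)."""
    norm = integrate(shape)
    return lambda x: total * shape(x) / norm

# Toy proton distributions (assumed shapes, chosen only for illustration).
sea = lambda x: 0.2 * (1 - x) ** 7                           # used for both ubar and dbar
u_val = normalized(lambda x: x ** 0.5 * (1 - x) ** 3, 2.0)   # two valence u quarks
d_val = normalized(lambda x: x ** 0.5 * (1 - x) ** 4, 1.0)   # one valence d quark

u = lambda x: u_val(x) + sea(x)
d = lambda x: d_val(x) + sea(x)
ubar = dbar = sea

# Eq. (14.45) divided by x: 2 (u - ubar - d + dbar), integrated as in Eq. (14.46).
adler = integrate(lambda x: 2 * (u(x) - ubar(x) - d(x) + dbar(x)))
# Eq. (14.47): -2 (u + d - ubar - dbar), integrated as in Eq. (14.49).
gls = integrate(lambda x: -2 * (u(x) + d(x) - ubar(x) - dbar(x)))

print(round(adler, 6), round(gls, 6))
```

The sea contribution cancels in both integrands, so the results are fixed by the valence normalizations alone and come out as 2 and $-6$ regardless of the assumed shapes.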

Historically, many of these sum rules were derived from a variety of related 
viewpoints, such as current algebra and the parton model. Although it was 
gratifying to see the success of these methods, they basically relied on a simplistic, 
free-field approach to the strong interactions. It was a puzzling question why 
this naive approach should work so well and at such low energies. Given the 
complicated nature of the strong interactions, the quark—parton model was working 
well beyond the range of validity originally postulated for it. 


14.4 Product Expansion at the Light-Cone 


Yet another way to see that scaling emerges in the high-energy limit is to use
Wilson's operator product expansion,$^{7,8}$ where we can show that the space-time
region explored by the deep inelastic experiments is near the light-cone. Again,
the mystery is why the free-field approximation should work so well in describing
strong interactions.

The scattering amplitude can be written as the matrix element of the commutator
of two currents:


$$W_{\mu\nu} = \frac{1}{4M}\sum_s\int\frac{d^4x}{2\pi}\,e^{iq\cdot x}\langle p,s|[J_\mu(x),J_\nu(0)]|p,s\rangle \tag{14.50}$$


We will show that, using the operator product expansion, we can rederive the
scaling behavior of the form factors found earlier with the parton model.

We will show this in two parts. First, we will show that the deep inelastic
experiments probe a region of space-time near the light-cone (i.e., $x^2 \approx 0$).
Second, we will then show that the operator product expansion of the currents
near the light-cone gives us the desired scaling of the form factors.

To see this, let us explore the high-$q$ behavior of the integral, which is dominated
by a region where $q\cdot x$ does not oscillate appreciably. (Regions of rapid
oscillation cancel each other out.) We expand $q\cdot x$ into its components:


$$q\cdot x = \frac{(q_0 + q_3)(x_0 - x_3)}{2} + \frac{(q_0 - q_3)(x_0 + x_3)}{2} - \mathbf{q}_T\cdot\mathbf{x}_T \tag{14.51}$$

Let us go to the rest frame of the proton, so that:

$$p_\mu = (M,0,0,0); \qquad q_\mu = \left(\nu,0,0,\sqrt{\nu^2 - q^2}\right) \tag{14.52}$$


In the deep inelastic limit, where $\nu, -q^2 \to \infty$ with $-q^2/2M\nu$ held fixed, we
can show that:

$$q_0 + q_3 \sim 2\nu; \qquad q_0 - q_3 \sim q^2/2\nu \tag{14.53}$$

Since we are interested in the region of space-time where $q\cdot x \sim 1$, we thus have:

$$x_0 - x_3 \sim O(1/\nu); \qquad x_0 + x_3 \sim O(1/xM) \tag{14.54}$$

Therefore:

$$x_0^2 - x_3^2 \sim O(-1/q^2) \tag{14.55}$$

This means that:

$$x^2 = x_0^2 - \mathbf{x}_T^2 - x_3^2 \le x_0^2 - x_3^2 \sim O(-1/q^2) \tag{14.56}$$

In other words, in the limit $q^2 \to -\infty$, the integral is dominated by:

$$x^2 \sim 0 \tag{14.57}$$
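
The asymptotic forms in Eq. (14.53) can be checked directly from the rest-frame kinematics of Eq. (14.52). The following is a minimal sketch; the function name `light_cone_components`, the proton mass value, and the sample value of Bjorken $x$ are assumptions made for illustration.

```python
import math

def light_cone_components(nu, x_bj, M=0.938):
    """Return (q0 + q3, q0 - q3) in the proton rest frame, Eq. (14.52).

    nu   : energy transfer in GeV
    x_bj : Bjorken x = -q^2 / (2 M nu), held fixed
    M    : proton mass in GeV (assumed value)
    """
    q2 = -2.0 * M * nu * x_bj        # q^2 < 0 for spacelike momentum transfer
    q3 = math.sqrt(nu * nu - q2)     # |q| from q^2 = nu^2 - |q|^2
    return nu + q3, nu - q3

for nu in (10.0, 100.0, 1000.0):
    plus, minus = light_cone_components(nu, x_bj=0.3)
    q2 = -2.0 * 0.938 * nu * 0.3
    # Ratios to the asymptotic forms of Eq. (14.53); both approach 1 as nu grows.
    print(nu, plus / (2 * nu), minus / (q2 / (2 * nu)))
```

Note that $q_0 - q_3$ tends to the fixed value $-xM$, which is why $x_0 + x_3$ in Eq. (14.54) is of order $1/xM$ rather than shrinking with $\nu$.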


Therefore $W_{\mu\nu}$ is dominated by the region of space-time near the light-cone. Having
established the importance of the light-cone, we will now show that
the operator product expansion near the light-cone yields scaling behavior.

To see the importance of the operator product expansion, we note that the
product of two fields taken at the same point is divergent. Our job is to calculate
the short-distance behavior of the product of two currents $J_\mu(x)J_\nu(0)$ and insert
this expression back into the integral. For free fields, we have:

$$J_\mu(x) = {:}\bar\psi(x)\gamma_\mu Q\psi(x){:} \tag{14.58}$$

where $Q$ is a matrix whose eigenvalues give the charges of the various fermions
in the theory.

To calculate the commutator, it will be useful to use Wick’s theorem to de- 
compose this product of currents. We will use a simple trick. We will analyze 
the time-ordered product, which yields propagators that have well-known power 
expansions. Then we will convert this time-ordered product into a commutator 
by a change in the singular structure of the fields. We begin by writing: 


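$$\begin{aligned}
T[J_\mu(x)J_\nu(0)] ={}& \mathrm{Tr}\left[iS_F(-x)\gamma_\mu iS_F(x)\gamma_\nu Q^2\right]\\
&+ {:}\bar\psi(x)\gamma_\mu Q\,iS_F(x)\gamma_\nu Q\psi(0){:}\\
&+ {:}\bar\psi(0)\gamma_\nu Q\,iS_F(-x)\gamma_\mu Q\psi(x){:}\\
&+ {:}\bar\psi(x)\gamma_\mu Q\psi(x)\bar\psi(0)\gamma_\nu Q\psi(0){:}
\end{aligned} \tag{14.59}$$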


The advantage of using the time-ordered expression (rather than the commutator)
is that the propagator has an explicit expression in terms of $x$-space
variables:

$$\Delta_F(x) = -\frac{1}{4\pi}\delta(x^2) + \frac{m}{8\pi\sqrt{x^2}}\theta(x^2)\left[J_1(m\sqrt{x^2}) - iN_1(m\sqrt{x^2})\right] - \frac{im}{4\pi^2\sqrt{-x^2}}\theta(-x^2)K_1(m\sqrt{-x^2}) \tag{14.60}$$

where $J_n$, $N_n$, and $K_n$ are the standard Bessel functions. For our purposes, however,
we are only interested in the behavior of this function near the light-cone: $x^2 \sim 0$.
In this approximation, we have the simple result:

$$\Delta_F(x) \sim \frac{i}{4\pi^2(x^2 - i\epsilon)} + O(m^2x^2) \tag{14.61}$$

Near the light-cone, the Feynman propagator for spin-$\frac{1}{2}$ fields can therefore
be written as:

$$S_F(x) = (i\gamma\cdot\partial + m)\Delta_F(x) \sim i\gamma\cdot\partial\left(\frac{i}{4\pi^2(x^2 - i\epsilon)}\right) \tag{14.62}$$


We now make the switch from the time-ordered product to the commutator. In 
space-time, this transition is possible if we make the substitution: 


( 1 ) ae (14.63) 


—x? —ie (n—1)! 


With this substitution, we can now write, using Wick’s theorem: 


Tro? [2 1 
[J.(x), J,(0)] ~ jae | 58! (a2)eC00) + 20,8 [eC] 


x { Syavg [V(x, 0) — V0, x)] 


+ i€yavp [AP(x, 0) — AP(0, x)] hae [5(x)e(x0)] /(20) 


+: Vx)y,OV(x)O)yQv(0) : (14.64) 
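where:

$$V^\beta(x,y) = {:}\bar\psi(x)\gamma^\beta Q^2\psi(y){:}$$

$$A^\beta(x,y) = {:}\bar\psi(x)\gamma^\beta\gamma_5Q^2\psi(y){:} \tag{14.65}$$

and where we have used the fact that:

$$\gamma_\mu\gamma_\nu\gamma_\alpha = (S_{\mu\nu\alpha\beta} + i\epsilon_{\mu\nu\alpha\beta}\gamma_5)\gamma^\beta$$

$$S_{\mu\nu\lambda\rho} = g_{\mu\nu}g_{\lambda\rho} + g_{\mu\rho}g_{\nu\lambda} - g_{\mu\lambda}g_{\nu\rho} \tag{14.66}$$

The first term in the commutator does not contribute, since it is a c-number. The
second term involves bilocal currents defined at two distinct space-time points.
To evaluate them, we can take a Taylor expansion of the fields: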


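$$\begin{aligned}
\bar\psi(x/2)\psi(-x/2) &= \bar\psi(0)\left(1 + \frac{x^{\mu_1}}{2}\overleftarrow\partial_{\mu_1} + \frac{1}{2!}\frac{x^{\mu_1}}{2}\frac{x^{\mu_2}}{2}\overleftarrow\partial_{\mu_1}\overleftarrow\partial_{\mu_2} + \cdots\right)\\
&\quad\times\left(1 - \frac{x^{\mu_1}}{2}\overrightarrow\partial_{\mu_1} + \frac{1}{2!}\frac{x^{\mu_1}}{2}\frac{x^{\mu_2}}{2}\overrightarrow\partial_{\mu_1}\overrightarrow\partial_{\mu_2} - \cdots\right)\psi(0)\\
&= \sum_n\frac{1}{n!}\frac{x^{\mu_1}}{2}\frac{x^{\mu_2}}{2}\cdots\frac{x^{\mu_n}}{2}\,\bar\psi(0)\overleftrightarrow\partial_{\mu_1}\overleftrightarrow\partial_{\mu_2}\cdots\overleftrightarrow\partial_{\mu_n}\psi(0)
\end{aligned} \tag{14.67}$$

where here $\overleftrightarrow\partial_\mu = \overleftarrow\partial_\mu - \overrightarrow\partial_\mu$.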
Putting this value back into the commutator of two currents, we find: 
1 xHi xh2 Pry, 
[ Jeol /2) J ag 2)) = a ee OS 
1 x! x2 xn owt 0)i o* [3(x?)e(x0)| 
+ DD OBinste- ua OS nave | “Gary 
(14.68) 
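where:

$$O^{(n)}_{\beta\mu_1\mu_2\cdots\mu_n}(0) = \bar\psi(0)\overleftrightarrow\partial_{\mu_1}\overleftrightarrow\partial_{\mu_2}\cdots\overleftrightarrow\partial_{\mu_n}\gamma_\beta Q^2\psi(0)$$

$$O^{5(n)}_{\beta\mu_1\mu_2\cdots\mu_n}(0) = \bar\psi(0)\overleftrightarrow\partial_{\mu_1}\overleftrightarrow\partial_{\mu_2}\cdots\overleftrightarrow\partial_{\mu_n}\gamma_\beta\gamma_5Q^2\psi(0) \tag{14.69}$$

Now insert this expansion back into the expression for $W_{\mu\nu}$. We are interested
in the spin-averaged matrix element of these operators. When we perform the average,
the matrix element of $O^{5(n)}$ vanishes because $\epsilon_{\mu\alpha\nu\beta}$ is antisymmetric. We only need
to define:

$$\frac{1}{2}\sum_s\langle p,s|O^{(n)}_{\beta\mu_1\cdots\mu_n}(0)|p,s\rangle = A^{(n+1)}p_\beta p_{\mu_1}\cdots p_{\mu_n} + \cdots \tag{14.70}$$

where $A^{(n+1)}$ are undetermined constants.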
Putting everything back into the expression for the deep inelastic scattering 
amplitude, we now have: 


W I he per 2)" Ae Baa [sr 2 
wlPia)~ og | ome pelt P| Y" —— Syavp p? 8° [8(x7)e(x0)] 
(14.71) 
Now, let us introduce yet another unknown function: 
i Aatd 
Dd, @:p/2 == Fe?) (14.72) 


odd n 
Thus, our expression for the scattering amplitude has now reduced down to:


i ve eee 
Wiv(P. 9) ~ 577 Spavp PP / Gay [e'*4 F(x + p)%5(x*)e(xo)] (14.73) 


Everything has now been concentrated into this one integration. To perform
this integral, it is useful to take the Fourier transform of $F(x\cdot p)$:

$$F(x\cdot p) = \int d\xi\,e^{ix\cdot p\,\xi}\tilde F(\xi) \tag{14.74}$$

Then the expression in the brackets becomes:

$$\int d\xi\,e^{i(q + \xi p)\cdot x}\delta(x^2)\epsilon(x_0)\tilde F(\xi) \tag{14.75}$$

We now take the Fourier transform of $\delta(x^2)\epsilon(x_0)$:

$$\int d^4k\,e^{-ik\cdot x}\delta(k^2)\epsilon(k_0) = -i(2\pi)^2\epsilon(x_0)\delta(x^2) \tag{14.76}$$
Putting this back into the expression for the scattering amplitude, we find: 


] = 
Ww ~ a / dé F(E)5(q? + 2MvE)Syaup(q + Ep)" p* 


= Suv , X PuPv 

polis [- i PEt ..| 14.7 

Oe 7 We eee) 

Thus, we have now shown that scaling occurs, that is, that the form factors are
functions of $x = -q^2/2M\nu$. Furthermore, we reproduce Eq. (14.24); that is, we
have the scaling behavior of spin-$\frac{1}{2}$ partons:

$$MW_1 \to F_1(x) = \frac{1}{2}\tilde F(x)$$

$$\nu W_2 \to F_2(x) = x\tilde F(x) \tag{14.78}$$

(Actually, this last relation is not surprising, since we have taken a representation of
the hadronic current in terms of free spin-$\frac{1}{2}$ quarks. If we had taken a representation
of the hadronic current in terms of free fields with different spins, we would have
derived different relations among the structure functions $W_{1,2}$.)

In conclusion, any theory of the strong interactions must reproduce two seemingly
contradictory experimental results: that the quarks seem to be strongly
bound together in the low-energy region, but that they act as if they are free in the
high-energy region; that is, they act as partons.
Remarkably, we will now see that the gauge theory of QCD can successfully
reproduce both behaviors. We will develop the theory of the renormalization
group, and see that QCD is asymptotically free; that is, the quarks have a vanishingly
small coupling constant in the high-energy region (they act as if they were free
point-like constituents) but they have a large coupling constant in the low-energy
region (which binds quarks together into mesons and baryons).


14.5 Renormalization Group 


The renormalization group equations$^{9-12}$ represent a deceptively simple constraint
on the renormalized vertex functions of any renormalizable field theory, yet they
yield some of the most nontrivial consequences.

The renormalization group equations are based on the simple observation
that the physical theory cannot depend on the subtraction point at which we
regularized our theory. We recall that the subtraction point $\mu$ was introduced
purely as a mathematical device to begin the process of renormalization, and that
no physical consequences could emerge from it.

This means that if we change the subtraction point $\mu$, other parameters, such
as the masses and coupling constants, must also change in order to compensate for
this effect. In order to keep the physics invariant, changing the subtraction point
must be offset by changes in the renormalized physical parameters as a function
of the energy.

There are several equivalent ways in which to view this highly nontrivial
feature of renormalization theory:


1. If we adopt the formalism of counterterms and subtractions, then there are
an infinite number of ways in which to split the unrenormalized action $\mathscr{L}_0$
into the renormalized piece $\mathscr{L}$ and its counterterm $\Delta\mathscr{L}$. This is because
there is an ambiguity in how to split $\mathscr{L}_0$ between the renormalized action
and the counterterm, as we saw in Chapter 7. Changing the subtraction point
$\mu$ creates a corresponding change in the value of the renormalized physical
parameters, so that there are an infinite number of possible renormalizations
[see Eqs. (7.71)-(7.75)]. However, the physical quantities at fixed energy
must be independent of how we make the split, and this independence is
mathematically expressed in terms of the renormalization group.


2. If we adopt the alternative viewpoint of multiplicative renormalization, then
we have a multiplicative relation between the vertex functions of the unrenormalized
theory $\Gamma_0^{(n)}$ and the vertex functions of the renormalized theory
$\Gamma_R^{(n)}$. However, the unrenormalized vertex function $\Gamma_0^{(n)}$ is totally independent
of the subtraction point $\mu$ (since subtractions are computed only for the
renormalized vertex):

$$\frac{\partial}{\partial\mu}\Gamma_0^{(n)} = 0 \tag{14.79}$$

Thus, keeping the unrenormalized vertex function $\Gamma_0^{(n)}$ independent of $\mu$
means that there is a nontrivial relation between the renormalized $\Gamma$ and $Z$,
which is expressed mathematically as the renormalization group equations.


3. The group nature of the renormalization group can be seen more abstractly
if we let $R$ represent some (unspecified) renormalization scheme. If $\Gamma_0$ is an
unrenormalized quantity and $\Gamma_R$ is the same quantity renormalized by the scheme
$R$, then:

$$\Gamma_R = Z(R)\Gamma_0 \tag{14.80}$$

where $Z(R)$ represents some renormalization constant under the renormalization
scheme $R$.

Let us now choose a different renormalization scheme $R'$. Since the
unrenormalized quantity $\Gamma_0$ is independent of the renormalization scheme,
then:

$$\Gamma_{R'} = Z(R')\Gamma_0 \tag{14.81}$$

Then the relationship between these two renormalized quantities is given by:

$$\Gamma_{R'} = Z(R',R)\Gamma_R \tag{14.82}$$

where:

$$Z(R',R) = Z(R')/Z(R) \tag{14.83}$$

Trivially, this satisfies a group multiplication law:

$$Z(R'',R')Z(R',R) = Z(R'',R) \tag{14.84}$$

where the identity element is given by:

$$Z(R,R) = 1 \tag{14.85}$$


Now that we have explained the origin of the renormalization group equations,
let us try to find a mathematical expression for these relations. While the
unrenormalized vertex functions are independent of $\mu$, the renormalized ones
are not. For example, in $\phi^4$ theory, we have the following relationship between
unrenormalized and renormalized quantities:

$$\Gamma_0^{(n)}(p_i,g_0,m_0) = Z_\phi^{-n/2}\Gamma^{(n)}(p_i,g,m,\mu) \tag{14.86}$$

where $\mu$ is the subtraction point, and we assume that we have used some regularization
scheme to render all expressions finite for the moment.

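Now let us differentiate this via the dimensionless derivative $\mu(d/d\mu)$. We
know that the unrenormalized bare quantity is independent of the subtraction point,
so that the derivative acting on the unrenormalized quantity must, by construction,
be zero:

$$0 = \mu\frac{d}{d\mu}\Gamma_0^{(n)} = \left(\mu\frac{\partial}{\partial\mu}Z_\phi^{-n/2}\right)\Gamma^{(n)} + Z_\phi^{-n/2}\mu\frac{d}{d\mu}\Gamma^{(n)} \tag{14.87}$$

We now use the chain rule. We choose as our independent variables $\mu$, $g$, and $m$:

$$\mu\frac{d}{d\mu} = \mu\frac{\partial}{\partial\mu} + \mu\frac{\partial g}{\partial\mu}\frac{\partial}{\partial g} + \mu\frac{\partial m}{\partial\mu}\frac{\partial}{\partial m} \tag{14.88}$$

Let us make the following definitions (where we now take the limit as $\epsilon \to 0$):

$$\beta(g) = \mu\frac{\partial g}{\partial\mu}$$

$$\gamma(g) = \mu\frac{\partial}{\partial\mu}\log\sqrt{Z_\phi}$$

$$m\gamma_m(g) = \mu\frac{\partial m}{\partial\mu} \tag{14.89}$$

With these definitions, we now have the compact expression:

$$\left(\mu\frac{\partial}{\partial\mu} + \beta(g)\frac{\partial}{\partial g} - n\gamma(g) + m\gamma_m(g)\frac{\partial}{\partial m}\right)\Gamma^{(n)}(p_i,g,m,\mu) = 0 \tag{14.90}$$

These are the renormalization group equations, and they express how the renormalized
vertex functions change when we make a change in the subtraction point
$\mu$.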

(In principle, the parameters like $\beta$ can also depend on the dimensionless
quantity $m/\mu$. Then the renormalization group equations become difficult to
solve as a function of two independent variables $g$ and $m/\mu$. However, we can
ignore this dependence on $m/\mu$ if we adopt the "mass-independent regularization
scheme," or the "minimal subtraction scheme," which we will discuss later in
Section 14.8. We tacitly assume that we adopt this regularization scheme.)

The importance of the renormalization group equations is that they tell us how
the renormalized functions change as we vary the subtraction point $\mu$. We know
that no physics can emerge by a change in the subtraction point; so a change
in the subtraction point must be compensated by a change in how we define the
renormalized coupling constants and renormalized masses. The renormalization
group equations perform the book-keeping necessary to keep track of how these
other variables change when we change the subtraction point.

From our point of view, the most important parameter is $\beta$. Knowledge of
$\beta$ determines the behavior of the coupling constant as a function of the mass
scale. (We should also point out that the functions $\beta$, etc. are dependent on the
regularization scheme that we use. Although the physics remains the same, the
exact form that these functions take varies with different regularization schemes.)

We first note that we can solve the defining equation for the $\beta$ function. We simply
divide by $\mu\beta$ and multiply by $d\mu$:

$$\frac{d\mu}{\mu} = \frac{dg}{\beta(g)} \tag{14.91}$$

Integrating, we have:

$$\log\frac{\mu}{\mu_0} = \int_{g(\mu_0)}^{g(\mu)}\frac{dg}{\beta(g)} \tag{14.92}$$

where $\mu_0$ is some arbitrary reference point. For the moment, let us assume that,
for small $g$, we can Taylor expand $\beta$:

$$\beta \sim bg^n + \cdots \tag{14.93}$$

for some coupling constant $g$ and integer $n$. Then, inserting this value of $\beta$ into
the integral equation, we can perform the integration and arrive at:

$$[g(\mu)]^{n-1} = \frac{[g(\mu_0)]^{n-1}}{1 - (n - 1)\,b\,[g(\mu_0)]^{n-1}\log(\mu/\mu_0)} \tag{14.94}$$
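
The closed form in Eq. (14.94) can be compared against a direct numerical integration of $\mu\,dg/d\mu = bg^n$. The sketch below is only an illustration: the function names, the Runge-Kutta integrator, and the starting value $g(\mu_0) = 0.5$ are assumptions, while the coefficient $b = 3/16\pi^2$ with $n = 2$ is the $\phi^4$ value used later in the text.

```python
import math

def running_coupling(g0, b, n, log_ratio):
    """Closed-form solution, Eq. (14.94), of mu dg/dmu = b g^n.

    log_ratio is log(mu/mu0); g0 is the coupling at mu0.
    """
    g0_pow = g0 ** (n - 1)
    return (g0_pow / (1.0 - (n - 1) * b * g0_pow * log_ratio)) ** (1.0 / (n - 1))

def integrate_rg(g0, b, n, log_ratio, steps=10_000):
    """Integrate dg/dt = b g^n in t = log(mu/mu0) with fourth-order Runge-Kutta."""
    beta = lambda g: b * g ** n
    h = log_ratio / steps
    g = g0
    for _ in range(steps):
        k1 = beta(g)
        k2 = beta(g + 0.5 * h * k1)
        k3 = beta(g + 0.5 * h * k2)
        k4 = beta(g + h * k3)
        g += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return g

# phi^4-like example: n = 2, b = 3 / (16 pi^2); g0 = 0.5 is an arbitrary choice.
b = 3.0 / (16.0 * math.pi ** 2)
exact = running_coupling(0.5, b, 2, log_ratio=5.0)
numeric = integrate_rg(0.5, b, 2, log_ratio=5.0)
print(exact, numeric)
```

With $b > 0$ the coupling grows with $\mu$; calling `running_coupling` with a negative `b` shows the opposite, asymptotically free behavior.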


Our goal, however, is to analyze the behavior of the theory at high energies,
so let us make the following scale transformation and derive a slightly different
constraint on the vertex functions. If we scale the momenta via:

$$p_i \to e^tp_i = \lambda p_i \tag{14.95}$$

then, using dimensional arguments, the vertex function behaves as:

$$\Gamma^{(n)}(\lambda p_i,g,\mu) = \mu^Df(\lambda^2p_i\cdot p_j/\mu^2) \tag{14.96}$$

where $D$ is the dimension of the vertex function. (This is because $\Gamma^{(n)}$ is a Lorentz
invariant, and hence can only be a function of the various dot products $p_i\cdot p_j$. To
create a dimensionless quantity out of this, we must divide this by $\mu^2$. The overall
scaling factor $\mu^D$ means that the function has dimension $D$.)

This, in turn, implies that the vertex function obeys the following equation:

$$\left(\frac{\partial}{\partial t} + \mu\frac{\partial}{\partial\mu} - D\right)\Gamma^{(n)}(\lambda p_i,g,\mu) = 0 \tag{14.97}$$

where $\lambda = e^t$.
Now let us eliminate the term $\mu(\partial/\partial\mu)\Gamma^{(n)}$ from this equation using Eq. (14.90),
neglecting the mass term. Then we find:

$$\left(-\frac{\partial}{\partial t} + \beta(g)\frac{\partial}{\partial g} + [D - n\gamma(g)]\right)\Gamma^{(n)}(\lambda p_i,g,\mu) = 0 \tag{14.98}$$

If $\beta$ were equal to zero, then the scaling behavior of the vertex function would be
given by:

$$\Gamma^{(n)} \sim \lambda^{D - n\gamma(g)} \tag{14.99}$$

which is the naive scaling behavior of the vertex function with the additional $\gamma(g)$
correction. This is the reason why $\gamma(g)$ is called the "anomalous" dimension. The
important point is that $\beta$ and $\gamma$ measure the deviation from naive scaling.

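Fortunately, we can solve this equation. Let us introduce the function $\bar g(g,t)$,
called the running coupling constant, such that:

$$\frac{d\bar g(g,t)}{dt} = \beta(\bar g) \tag{14.100}$$

with the boundary condition that $\bar g(g,0) = g$. Then the solution is given by:

$$\Gamma^{(n)}(\lambda p_i,g) = \Gamma^{(n)}(p_i,\bar g(g,t))\exp\int_g^{\bar g(g,t)}dg'\,\frac{D - n\gamma(g')}{\beta(g')} \tag{14.101}$$

To prove this, substitute it directly into Eq. (14.98), using the fact that the running
coupling obeys $(-\partial/\partial t + \beta(g)\,\partial/\partial g)\,\bar g(g,t) = 0$.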

To analyze the nature of these solutions, let us make a few definitions.

Let a fixed point represent a zero of the $\beta$ function for some value $g_F$. The
origin of the name comes from the fact that if the coupling constant sits at this
fixed point $g_F$, it will remain there as we increase $\mu$.

Figure 14.3. In (a), the slope of $\beta$ is negative, giving us an ultraviolet fixed point. In (b),
the slope is positive, giving us an infrared fixed point.

To see this, let us analyze the situation in Figure 14.3 and power expand $\beta$
around the fixed point:

$$\mu\frac{\partial g}{\partial\mu} = \beta(g) = (g - g_F)\beta'(g_F) + \cdots \tag{14.102}$$

In Figure 14.3(a), the slope of $\beta$ is negative at the fixed point $g_F$. Consider
what happens, as $\mu$ increases, when $g$ is near $g_F$. If $g$ is less than $g_F$, and if
$\beta' < 0$, then the two signs cancel in the Taylor expansion and $\partial g/\partial\mu$ is positive,
so $g$ rises with rising $\mu$. This means that $g$ is driven towards $g_F$ for increasing $\mu$.
If, however, $g$ is larger than $g_F$ and $\beta' < 0$, then the derivative $\partial g/\partial\mu$ is negative,
so $g$ decreases with increasing $\mu$. Thus, $g$ is driven downwards back toward $g_F$
with increasing $\mu$. In both situations, $g$ is driven towards $g_F$ with increasing $\mu$.
We call this an ultraviolet stable fixed point.

Now consider the situation in Figure 14.3(b), where the slope of $\beta$ is positive
at $g_F$. Then $g$ is also driven towards $g_F$, but for decreasing values of $\mu$. If $g$ is
less than $g_F$ and $\beta' > 0$, then $\partial g/\partial\mu$ is negative. Thus, for decreasing $\mu$, $g$ will
increase in the direction of $g_F$. Likewise, if $g$ is greater than $g_F$ and $\beta' > 0$, then
$\partial g/\partial\mu$ is positive, and hence $g$ will decrease towards $g_F$ if $\mu$ decreases.
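
The two cases can be illustrated with a linear model of the $\beta$ function near the fixed point. The following is a rough sketch under stated assumptions: the slope $\pm 0.5$, the fixed-point value $g_F = 1$, the starting couplings, and the simple Euler integrator `flow` are all invented for illustration.

```python
def flow(beta, g0, t_final, steps=20_000):
    """Integrate dg/dt = beta(g), with t = log(mu/mu0), using Euler steps.

    A negative t_final lowers mu (flow toward the infrared).
    """
    h = t_final / steps
    g = g0
    for _ in range(steps):
        g += h * beta(g)
    return g

G_F = 1.0  # location of the fixed point (illustrative value)

beta_uv = lambda g: -0.5 * (g - G_F)  # beta'(g_F) < 0, as in Figure 14.3(a)
beta_ir = lambda g: +0.5 * (g - G_F)  # beta'(g_F) > 0, as in Figure 14.3(b)

for g0 in (0.6, 1.4):
    # Increasing mu drives g to g_F in case (a); decreasing mu does so in case (b).
    print(g0, flow(beta_uv, g0, t_final=+20.0), flow(beta_ir, g0, t_final=-20.0))
```

Running either model in the opposite direction of $\mu$ pushes $g$ away from $g_F$ instead, which is the sense in which each fixed point is stable in only one regime.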

We can summarize this situation as follows:

$$\beta'(g_F) < 0:\ \text{Ultraviolet stable}; \qquad \beta'(g_F) > 0:\ \text{Infrared stable} \tag{14.103}$$

Let us analyze some theories in this context. We know that, for $\phi^4$ theory, the
coupling constants are related by:

$$g_0 = g\mu^\epsilon\left(1 + \frac{3g}{16\pi^2\epsilon} + \cdots\right) \tag{14.104}$$

Differentiating, we have:

$$\beta(g) = \mu\frac{\partial g}{\partial\mu} = -\epsilon g + \frac{3g^2}{16\pi^2} + \cdots \to \frac{3g^2}{16\pi^2} \tag{14.105}$$

in the limit $\epsilon \to 0$.


The theory is not asymptotically free because of the positive sign of $\beta$. In fact,
it is easy to integrate the previous equation as a function of $\mu$, and we arrive at:

$$g(\mu) = \frac{g(\mu_0)}{1 - (3/16\pi^2)\,g(\mu_0)\log(\mu/\mu_0)} \tag{14.106}$$

Clearly, increasing $\mu$ increases $g$.
Next, let us investigate QED. As before, we know that:

$$e_0 = e\mu^{\epsilon/2}\frac{Z_1}{Z_2\sqrt{Z_3}} = \frac{e\mu^{\epsilon/2}}{\sqrt{Z_3}} = e\mu^{\epsilon/2}\left(1 + \frac{e^2}{12\pi^2\epsilon} + \cdots\right) \tag{14.107}$$

Differentiating this equation to solve for $\beta$, we find:

$$\beta(e) = \mu\frac{\partial e}{\partial\mu} = -\frac{\epsilon}{2}e + \frac{e^3}{12\pi^2} + \cdots \to \frac{e^3}{12\pi^2} \tag{14.108}$$

as $\epsilon \to 0$. As with the $\phi^4$ theory, we find that $\beta > 0$, so that the running coupling
constant $e$ increases with larger energies. In fact, we can easily integrate this
equation, arriving at:

$$e^2(\mu) = \frac{e^2(\mu_0)}{1 - (e^2(\mu_0)/6\pi^2)\log(\mu/\mu_0)} \tag{14.109}$$

which we derived in Section 7.4, using simpler methods.
The coupling constant increases with $\mu$, and there is the Landau singularity
at:

$$\mu = \mu_0\exp\left(\frac{6\pi^2}{e^2(\mu_0)}\right) \tag{14.110}$$

(Although it appears as if the coupling constant blows up at this point, we must
realize that the formula breaks down in the approximation we have made, i.e., for
small $e$ only.)
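
To get a feeling for how remote the Landau singularity is, one can evaluate Eqs. (14.109) and (14.110) numerically. This is a minimal sketch, assuming the input $\alpha = e^2/4\pi = 1/137$ at the reference scale $\mu_0$ and keeping only the single charged fermion of the one-loop formula; the function names are illustrative.

```python
import math

def qed_coupling_squared(e2_0, log_ratio):
    """One-loop running of e^2 from Eq. (14.109); log_ratio = log(mu/mu0)."""
    return e2_0 / (1.0 - e2_0 * log_ratio / (6.0 * math.pi ** 2))

def landau_log(e2_0):
    """log(mu_Landau/mu0), where the denominator of Eq. (14.109) vanishes."""
    return 6.0 * math.pi ** 2 / e2_0

# Assumed input: alpha = e^2 / (4 pi) = 1/137 at the reference scale mu0.
e2_0 = 4.0 * math.pi / 137.0
t_pole = landau_log(e2_0)

# Number of decades in mu between mu0 and the singularity.
print(t_pole / math.log(10.0))
# Halfway to the pole (in log mu) the coupling e^2 has only doubled.
print(qed_coupling_squared(e2_0, 0.5 * t_pole) / e2_0)
```

The singularity sits hundreds of orders of magnitude above $\mu_0$, far beyond any scale where this one-loop, small-$e$ formula could be trusted.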


14.6 Asymptotic Freedom 


One of the theoretical breakthroughs in quantum field theory came when the high- 
energy scaling behavior found at the SLAC experiments could be explained via 
non-Abelian gauge theory. 

Previously, in Eq. (13.13), we found that the coupling constant renormalization 
in gauge theory was given multiplicatively by: 


$$g_0 = g\mu^{\epsilon/2}\frac{Z_1}{Z_2\sqrt{Z_3}} \tag{14.111}$$
If we put in the values of the $Z$'s, we find:

$$g_0 = g\mu^{\epsilon/2}\left[1 - \frac{g^2}{16\pi^2\epsilon}\left(\frac{11}{3}C_{ad} - \frac{4}{3}C_f\right) + \cdots\right] \tag{14.112}$$

We can then solve for $\beta$:

$$\beta(g) = \mu\frac{\partial g}{\partial\mu} = -\frac{\epsilon}{2}g - \frac{g^3}{16\pi^2}\left(\frac{11}{3}C_{ad} - \frac{4}{3}C_f\right) + \cdots \to -\frac{g^3}{16\pi^2}\left(\frac{11}{3}C_{ad} - \frac{4}{3}C_f\right) \tag{14.113}$$

We come to the rather surprising conclusion that the theory is asymptotically free
if the following condition is satisfied:

$$\frac{11}{3}C_{ad} > \frac{4}{3}C_f \tag{14.114}$$


This is the first example of asymptotic freedom,$^{13-15}$ which was discovered
by Gross, Wilczek, Politzer, and independently by 't Hooft. Asymptotic freedom
only occurs in the presence of gauge theories. For QCD, we have the group SU(3),
so that $C_{ad} = 3$ and, for $N_f$ flavors of quarks in the fundamental representation,
$C_f = N_f/2$. The final relationship now reads:

$$\beta(g) = -\frac{g^3}{16\pi^2}\left(11 - \frac{2}{3}N_f\right) \tag{14.115}$$

where $N_f$ is the number of flavors of fermions in the theory. This is one of the
theoretical triumphs of gauge theory, that gauge theory proves to be the most
important ingredient in any asymptotically free theory.

The value of the coupling constant can also be integrated explicitly. Performing
the integration, we find to the one-loop level:

$$g^2(\mu) = \frac{g^2(\mu_0)}{1 + (g^2(\mu_0)/8\pi^2)\left(\frac{11}{3}C_{ad} - \frac{4}{3}C_f\right)\log(\mu/\mu_0)} \tag{14.116}$$
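
The condition (14.114) and the one-loop running (14.116) are easy to explore numerically. The sketch below rewrites Eq. (14.116) in terms of $\alpha_s = g^2/4\pi$; the function names and the reference input ($\alpha_s = 0.118$ at $\mu_0 = 91.2$ GeV with $N_f = 5$) are assumptions chosen for illustration and are not taken from the text.

```python
import math

def b0(n_f, c_ad=3.0):
    """Coefficient (11/3) C_ad - (4/3) C_f of Eq. (14.113), with C_f = n_f / 2."""
    return (11.0 / 3.0) * c_ad - (4.0 / 3.0) * (n_f / 2.0)

def alpha_s(mu, mu0, alpha0, n_f):
    """One-loop running of alpha_s = g^2 / (4 pi), obtained from Eq. (14.116)."""
    return alpha0 / (1.0 + alpha0 * b0(n_f) * math.log(mu / mu0) / (2.0 * math.pi))

# Largest number of flavors for which SU(3) is still asymptotically free (b0 > 0).
print([n for n in range(20) if b0(n) > 0][-1])

# Illustrative input (an assumed reference value, not taken from the text):
# alpha_s = 0.118 at mu0 = 91.2 GeV with n_f = 5.
for mu in (10.0, 91.2, 1000.0):
    print(mu, alpha_s(mu, 91.2, 0.118, n_f=5))
```

For SU(3) the sign of the $\beta$ function flips between 16 and 17 flavors, and with $b_0 > 0$ the coupling falls as $\mu$ rises and grows toward low energies.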


Because of the importance of asymptotic freedom, the $\beta$ function has even
been computed to three loops:

$$\beta(g) = -\frac{g^3}{16\pi^2}\left(11 - \frac{2}{3}N_f\right) - \frac{g^5}{(16\pi^2)^2}\left(102 - \frac{38}{3}N_f\right) - \frac{g^7}{(16\pi^2)^3}\left(\frac{2857}{2} - \frac{5033}{18}N_f + \frac{325}{54}N_f^2\right) + \cdots \tag{14.117}$$

(We should mention that the $\beta$ function actually vanishes to all orders in perturbation
theory for certain forms of super Yang-Mills theories, which are finite to
all orders in perturbation theory. This will be discussed in more detail in Chapter
20.)

In summary, asymptotic freedom means that, roughly speaking, at shorter 
and shorter distances, the coupling constant decreases in size, so that the theory 
appears to be a free theory. This is the phenomenon of scaling, which is simply 
interpreted as the quarks acting as if they were free partons in the high-energy 
realm. 

Conversely, at larger and larger distances, the coupling constant increases, so
that at a certain point perturbative calculations can no longer be trusted. Large
coupling constants, in turn, imply that the quarks bind more tightly together,
giving rise to confinement. This is called "infrared slavery," which is the flip side
of asymptotic freedom.

Finally, we remark that there is a simple way in which we can describe 
asymptotic freedom, which only manifests itself in non-Abelian gauge theories. 
Although this example does not explain asymptotic freedom, it gives us a conve- 
nient intuitive model by which to describe it. 

In the case of QED, we know that, at large distances, the effective coupling
constant $\alpha$ gets smaller. This is because any charged particle is surrounded by a
dense cloud of virtual electron-positron pairs that tend to screen the charge of the
particle. Thus, the effective coupling constant is reduced by the presence of this
screening charge. At smaller distances, and higher energy, a probe can penetrate
through this virtual cloud, and hence the QED coupling constant gets larger as we
increase the energy of the probe.

Classically, we can think of this in terms of the dielectric constant of the
vacuum. If we place a charge in a dielectric, we know that the electric field of
the charge causes the dipoles within the dielectric medium to line up. The net
effect of the dipoles lining up around the charge is to decrease the effective charge, so the
medium has a dielectric constant greater than one.

The situation in QCD, as we have seen, is precisely the opposite. We no
longer have an electric charge (since the QCD gauge particles are electrically neutral),
but we have color charges and color coupling constants. This means that, at
large distances (low energy), the presence of the cloud of virtual particles creates
an antiscreening effect. The net coupling constant gets larger at large distances.
Contrary to the situation in QED, a probe that comes near a colored particle feels
the coupling constant decrease at high energies. Thus, the dielectric constant of
the vacuum is less than one for an asymptotically free theory.


14.7 Callan—Symanzik Relation 


We now would like to clarify certain points that were ignored earlier. We pointed
out after Eq. (14.90) that $\beta$ is not, strictly speaking, just a function of $g$. It can also
be a function of the dimensionless parameter $m/\mu$, and hence the renormalization
group equations become much more difficult to solve. Therefore our previous
derivation of the scaling relations, although correct, was actually incomplete.
There are several ways in which to complete this subtle but important step.
The first is to use a slightly different form for the renormalization group equations,
called the Callan-Symanzik relations,$^{11,12}$ which are written as derivatives with
respect to the bare masses, rather than the subtraction point $\mu$. In this case, $\beta$ and
$\gamma$ appear in slightly different form, but are now functions of just one variable, the
renormalized coupling constant. Then the renormalization group equations can
be rigorously solved, and we find the scaling relations derived earlier.

There is also a second solution, which is to use a different regularization
scheme, called the mass-independent minimal subtraction (MS) scheme,$^{16}$ where
the mass dependence drops out from the very beginning in the definition of $\beta$.
Then we can ignore the mass dependence of these functions because of the way
that we have regularized all divergent integrals.

We will discuss the first solution to this subtle problem using the Callan-Symanzik
relations, where the renormalization group equations are derived from
a slightly different set of physical assumptions than before. Then later we will
discuss the MS scheme.

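We begin with the obvious identity that the derivative of a propagator with
respect to the unrenormalized mass squared simply squares the propagator:

$$\frac{\partial}{\partial m_0^2}\frac{i}{p^2 - m_0^2 + i\epsilon} = \frac{i}{p^2 - m_0^2 + i\epsilon}\,(-i)\,\frac{i}{p^2 - m_0^2 + i\epsilon} \tag{14.118}$$

or simply:

$$\frac{\partial}{\partial m_0^2}i\Delta_F = i\Delta_F(-i)i\Delta_F \tag{14.119}$$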


Now assume that $i\Delta_F$ occurs in some vertex function $\Gamma^{(n)}$ of arbitrary order.
Each time a propagator appears, the derivative replaces the propagator with the
square of the propagator. From a field theory point of view, the squaring of the
propagator (with the same momentum) is equivalent to the insertion of the operator
$\phi^2(x)$ in the diagram with zero momentum. [We recall that the addition of the
counterterm $\delta m^2\phi^2$ into the action had the net effect of converting each $\Delta_F$ into
a propagator with a shifted mass. In the same way, the squaring of each propagator
can be simulated by the insertion of $\phi^2(x)$, which acts like a counterterm.]

This means that the derivative of an arbitrary vertex function with respect to
$m_0^2$ yields another vertex function where $\phi^2(x)$ with zero momentum has been
inserted. In other words:

$$\frac{\partial\Gamma_0^{(n)}(p_i)}{\partial m_0^2} = -i\Gamma^{(n)}_{0,\phi^2}(0;p_i) \tag{14.120}$$

where $\Gamma^{(n)}_{0,\phi^2}$ represents a vertex function with the insertion of this composite
operator.

We now make the transition from the unrenormalized vertices to the renormalized
ones. This means the introduction of yet another renormalization constant
$Z_{\phi^2}$ to renormalize the insertion of the composite field operator.
As before, the relationship between the renormalized and unrenormalized
vertices is given by:

$$\Gamma^{(n)}(p_i,g,m) = Z_\phi^{n/2}\Gamma_0^{(n)}(p_i,g_0,m_0)$$

$$\Gamma^{(n)}_{\phi^2}(p;p_i,g,m) = Z_{\phi^2}^{-1}Z_\phi^{n/2}\Gamma^{(n)}_{0,\phi^2}(p;p_i,g_0,m_0) \tag{14.121}$$


Now we use the chain rule to write: 


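$$\frac{\partial}{\partial m_0^2} = \frac{\partial m^2}{\partial m_0^2}\frac{\partial}{\partial m^2} + \frac{\partial g}{\partial m_0^2}\frac{\partial}{\partial g} \qquad (14.122)$$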


As before, we now apply this operator on the unrenormalized vertices, which then
picks up derivatives of the renormalization constants $Z_\phi$ and $Z_{\phi^2}$.
Putting everything together, and dividing by $\partial m^2/\partial m_0^2$, we find:


A) a 
i ae P¢p. ~ _imte TON n, 
(mo Poe nv) (pi, g,m) im als (0, pi, Z,™m) (14.123) 


where: 


dm> \ ams 
as 5d log Zy a) 
am? am? 
9Z42 (am? 
ae = (Fe (14.124) 
dmo dm 


Although these equations look suspiciously like the previous renormalization
group equations in Eq. (14.90), there are many subtle but crucial differences.
First, the definitions of the parameters, like $\beta$, are different from the usual ones.
Most important, the previous renormalization group equations were written as
derivatives with respect to the subtraction point $\mu$, while the new ones are written
with respect to the unrenormalized mass $m_0$. Second, it can be shown that
these functions are strictly functions of just one variable, the coupling constant.
Hence, they can be solved using the methods outlined earlier. Third, there is an
inhomogeneous term on the right-hand side of this equation, while the previous
renormalization group equation did not have this term.

Next, we would like to show how to eliminate the inhomogeneous term appearing
in the Callan–Symanzik relation that does not appear in the original
formulation of the renormalization group equations. To eliminate this term, we
will appeal to Weinberg's theorem. In the version that we need, this theorem tells
us that if we scale the external momenta as $p_i \to \lambda p_i$ in the deep Euclidean region,
the one-particle-irreducible Green's functions $\Gamma^{(n)}(p_i)$ grow as $\lambda^{4-n}$ times
polynomials in $\log\lambda$, while the Green's function $\Gamma^{(n)}_{\phi^2}$ grows only as $\lambda^{2-n}$ times
similar polynomials. We note that this growth is just what one might expect
from naive dimensional grounds, and is also the superficial degree of divergence
of the graph.

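Mathematically, Weinberg's theorem tells us that:


$$\Gamma^{(n)}(\lambda p_i, g, m) \to \lambda^{4-n} \sum_k a_k (\log\lambda)^k$$

$$\Gamma^{(n)}_{\phi^2}(0; \lambda p_i, g, m) \to \lambda^{2-n} \sum_k b_k (\log\lambda)^k \qquad (14.125)$$


for some constants $a_k$, $b_k$. (In principle, the logarithms can sum to a nontrivial
expression, giving us the possibility that the entire expression scales as $\lambda^{4-n-\gamma}$,
where $\gamma$ is just the familiar anomalous dimension.)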

For our purposes, the important point is that scaling takes us into the deep
Euclidean region, where $\Gamma^{(n)}$ is much larger than $\Gamma^{(n)}_{\phi^2}$, so that we can drop the
latter term.

The Callan–Symanzik equations then become homogeneous in this limit, like the
renormalization group equations given earlier. They can then be solved in much
the same way as Eq. (14.101).


14.8 Minimal Subtraction 


Finally, it is remarkable that the renormalization group equations work at all,
that is, that we can extract information concerning the higher-order behavior
of the coupling constants knowing only the one-loop results.

For example, if we power expand the coupling constant g, we find an infinite 
series of logarithms. The renormalization group equations, on the basis of just 
the one-loop results, are able to reproduce the leading logarithmic behavior of the 
entire function, without having to compute any higher-loop Feynman diagrams. 

To see the origin of this rather mysterious but important result, it is perhaps
instructive to use what is called the minimal subtraction (MS) scheme.¹⁶ The MS
scheme defines the renormalized coupling constants strictly in terms of their poles
using dimensional regularization. Since we have the freedom to choose where we
separate the infinite part from the finite part, we will define the subtraction scheme
so that we only take the poles in $\epsilon$, so the counterterms have no finite parts.

The MS scheme has a further advantage because it is a mass-independent
renormalization scheme. The $Z$'s depend on $\mu$ only through the renormalized
coupling constants. Hence, by dimensional arguments, the MS scheme produces
functions $\beta$, etc., that are independent of the renormalized mass. We mentioned
earlier that our original derivation of the renormalization group equations in Eq.
(14.90) ignored the fact that $\beta$ could, in principle, be a function of both $g$ and $m/\mu$,
which made solving the renormalization group equations difficult. We ignored
the dependence on $m/\mu$ because there exists a renormalization scheme, the MS
scheme, in which $\beta$ is strictly independent of the renormalized mass $m$.

To begin our discussion of the MS, we define our unrenormalized variables as 
follows: 

The coefficients in the expansion are independent of $m/\mu$, without any finite parts
at all. We also assume that $\mu(\partial g/\partial\mu)$ is a smooth analytic function of $\epsilon$, so that
we can expand:


$$\mu\frac{\partial g}{\partial\mu} = \sum_{n=0}^{\infty} d_n \epsilon^n \qquad (14.127)$$


Now the key physical input is this: the bare quantities are all, by definition,
independent of the subtraction point. Thus, we can differentiate them with respect
to $\mu$ and set the result to zero:


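$$\mu\frac{\partial g_0}{\partial\mu} = 0 \qquad (14.128)$$

or:

$$\epsilon\left(g + \sum_{n=1}^{\infty}\frac{a_n}{\epsilon^n}\right) + \left(1 + \sum_{n=1}^{\infty}\frac{1}{\epsilon^n}\frac{\partial a_n}{\partial g}\right)\mu\frac{\partial g}{\partial\mu} = 0 \qquad (14.129)$$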


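This is a set of nontrivial, highly coupled equations linking the various terms in
the MS scheme. To solve them, let us insert the power expansion of $\mu(\partial g/\partial\mu)$
into the previous equation and sort out powers of $\epsilon$:


$$\epsilon(g + d_1) + \left(a_1 + d_0 + d_1\frac{da_1}{dg}\right) + \sum_{n=1}^{\infty}\frac{1}{\epsilon^n}\left(a_{n+1} + d_0\frac{da_n}{dg} + d_1\frac{da_{n+1}}{dg}\right) = 0 \qquad (14.130)$$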

Since each order of $\epsilon$ must vanish separately, we now have the equations:


$$d_1 = -g; \qquad a_1 + d_1\frac{da_1}{dg} = -d_0; \qquad \left(1 + d_1\frac{d}{dg}\right)a_{n+1} = -d_0\frac{da_n}{dg} \qquad (14.131)$$


as well as:

$$\left(1 - g\frac{d}{dg}\right)a_{n+1} = \frac{da_n}{dg}\left(1 - g\frac{d}{dg}\right)a_1 \qquad (14.132)$$


This is an important recursion relation, because it shows that the residues of the 
higher-order pole terms can, in principle, be determined from a knowledge of just 
the simple pole term. 
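
To make the bootstrap concrete, here is a minimal sketch that iterates the recursion with exact rational arithmetic. It rests on assumptions that are not spelled out in the surviving text: the bare coupling is taken as $g_0 = \mu^\epsilon[g + \sum_n a_n(g)/\epsilon^n]$, the recursion is used in the form $(1 - g\,d/dg)\,a_{n+1} = a_n'\,(1 - g\,d/dg)\,a_1$ obtained by eliminating $d_0$ and $d_1$ from Eq. (14.131), and the simple-pole residue is a toy one-loop form $a_1 = b\,g^3$ with a hypothetical coefficient $b$. The helper names are illustrative only.

```python
from fractions import Fraction
from math import comb

# A polynomial in g is stored as {power: coefficient}.

def deriv(poly):
    """d/dg of a polynomial."""
    return {p - 1: c * p for p, c in poly.items() if p != 0}

def mul(p1, p2):
    """Product of two polynomials."""
    out = {}
    for i, x in p1.items():
        for j, y in p2.items():
            out[i + j] = out.get(i + j, 0) + x * y
    return out

def one_minus_g_ddg(poly):
    """Apply (1 - g d/dg); a term g^p is multiplied by (1 - p)."""
    return {p: c * (1 - p) for p, c in poly.items() if p != 1}

def next_residue(a_n, a_1):
    """Solve (1 - g d/dg) a_{n+1} = a_n' (1 - g d/dg) a_1 for a_{n+1}.
    Assumes no term linear in g, which (1 - g d/dg) would annihilate."""
    rhs = mul(deriv(a_n), one_minus_g_ddg(a_1))
    return {p: c / (1 - p) for p, c in rhs.items()}

def closed_form(n, b):
    """Coefficient of 1/eps^n in the expansion of g*(1 - 2*b*g^2/eps)^(-1/2)."""
    return {2 * n + 1: Fraction(comb(2 * n, n), 2**n) * b**n}

b = Fraction(1)        # hypothetical one-loop coefficient
a_1 = {3: b}           # toy simple-pole residue a_1 = b g^3
residues = {1: a_1}
for n in range(1, 5):
    residues[n + 1] = next_residue(residues[n], a_1)

for n, poly in residues.items():
    print(n, poly, poly == closed_form(n, b))
```

For this toy input the recursion reproduces, order by order, the expansion of $g\,(1 - 2bg^2/\epsilon)^{-1/2}$, which is what one obtains by integrating $\mu\,\partial g_0/\partial\mu = 0$ directly with the same one-loop input. The higher poles therefore carry no new information beyond $a_1$.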

We can repeat the same steps for the other parameters. Since the other
unrenormalized parameters are also independent of the subtraction point $\mu$, we
know that $\partial m_0/\partial\mu = 0$ and $\partial\phi_0/\partial\mu = 0$. We can therefore derive the recursion
relation for the other residues:


tl 


d 
gM 41 mngm, — mi, (1 = s-) 2, =0 


d 
8Pna1 dng, — %, (1 — s<-) 21 (14.133) 


In terms of the original renormalization group parameters, we also have: 


$$\beta(g) = -a_1 + g\,a_1'$$
$$\gamma(g) = g\,c_1'$$
$$\gamma_m(g) = g\,b_1' \qquad (14.134)$$


We can draw several interesting conclusions from this simple exercise. It 
is possible to construct a self-consistent renormalization scheme based entirely 
on dropping all finite parts in the counterterms. Thus, the counterterms are 
chosen to cancel just the poles, nothing more. Then the mass dependence within 
the renormalization group parameters disappears, and our previous assumption 
about dropping the $m/\mu$ dependence is justified. (We should point out there
exists a modified MS procedure, called $\overline{\rm MS}$, which is used extensively in the
literature. In the $\overline{\rm MS}$ scheme, we eliminate the poles along with certain finite
transcendental constants.) Furthermore, the higher-order terms (in principle) can
be determined from lower-order terms by a recursion relation. The nth-order 
coefficients are all determined from the (n — 1)st-order terms. In other words, the 
renormalization group equations tell us that the knowledge of the lowest-order 
terms will automatically determine much of the higher-order behavior, without
actually having to compute all higher-order graphs. Thus, it is now no mystery
why the renormalization group equations only need, as input, the lowest-order 
one-loop results, yet manage to determine much of the higher-order behavior 
without having to perform multiloop calculations. The mathematical essence 
of the renormalization group equations is that they are a recursion relation that 
“bootstraps” all the higher coefficients from a knowledge of just the lower ones. 


14.9 Scale Violations 


The actual experiments done on deep inelastic scattering not only give us the
scaling behavior, they also give us the deviation from exact scaling, which we would
now like to calculate using renormalization group methods. In particular, we will
write down the renormalization group equations for the structure functions found
in lepton–nucleon scattering.

In the language of the operator product expansion given earlier, we can write 
the behavior of two operators near the light-cone: 


$$A(x)B(0) \sim \sum_{n,i} C_i^n(x^2)\, x^{\mu_1}\cdots x^{\mu_n}\, O^i_{\mu_1\cdots\mu_n}(0) \qquad (14.135)$$


where we no longer assume that we are dealing with free quark states. The
singularity found earlier for the free quark model can be included in the $C_i^n$
function. The summation is performed over the spin $n$ of the operators, and also
the type $i$, which is not yet specified.

We know that the dimension $d_A + d_B$ of the left-hand side must be equivalent
to the dimension of the right-hand side, which is given by $-n + d_O$. By scaling
arguments, we then have the behavior of the coefficients near the light-cone:


$$C_i^n(x^2) \sim (x^2)^{-(d_A + d_B + n - d_O)/2} \qquad (14.136)$$


Examining the power expansion, we see it is in general dominated by the 
operator with minimum twist, which is defined by: 


$$\text{Twist} = \tau = d_O - n \qquad (14.137)$$


that is, the twist is equal to the dimension of the operator minus the spin. For
example, simple operators of $\tau = 1$ are given by:


$$\phi; \quad \partial_\mu\phi; \quad \partial_\mu\partial_\nu\phi \qquad (14.138)$$

In deep inelastic scattering, the relevant operators are composed of quark fields
$\psi$ and gluons $A^\mu$, so the minimum twist operators have $\tau = 2$ and their product
expansion is given by:


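$$iT[J_\mu(x)J_\nu(0)] \sim \sum_{n,i}\Big[-g_{\mu\nu}\,x_{\mu_1}\cdots x_{\mu_n}\, i^n\, C^n_{1,i}(x^2, g, \mu) + g_{\mu\mu_1}g_{\nu\mu_2}\,x_{\mu_3}\cdots x_{\mu_n}\, i^{n-2}\, C^n_{2,i}(x^2, g, \mu)\Big]\,O_i^{\mu_1\cdots\mu_n}(0) \qquad (14.139)$$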


As before, by Lorentz invariance we can write the matrix element of $O_i$ as a
function of the momentum $p$:


$$\langle p|\,O_i^{\mu_1\cdots\mu_n}\,|p\rangle = A_i^n\,(p^{\mu_1}p^{\mu_2}\cdots p^{\mu_n} + \text{trace terms}) \qquad (14.140)$$


where $A_i^n$ are undetermined constants, and where the trace terms arise because the
operator is traceless and symmetric.

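In this form, the traces are a bit unmanageable. But we will use a trick. Each
$p_{\mu_i}$ is contracted onto an $x_{\mu_i}$, which in turn can be converted into $\partial/\partial q_{\mu_i}$ when we
take the Fourier transform of the expression. Then we can use the identity:


$$\frac{\partial}{\partial q_{\mu_1}}\frac{\partial}{\partial q_{\mu_2}}\cdots\frac{\partial}{\partial q_{\mu_n}} = 2^n q^{\mu_1}q^{\mu_2}\cdots q^{\mu_n}\left(\frac{\partial}{\partial q^2}\right)^n + \text{trace terms} \qquad (14.141)$$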
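
As a short worked case of this identity, take $n = 2$ and let the derivatives act on an arbitrary function $f(q^2)$ (the function $f$ is introduced here only for illustration). Using $\partial q^2/\partial q_\mu = 2q^\mu$ and $\partial q^\nu/\partial q_\mu = g^{\mu\nu}$:

```latex
\frac{\partial}{\partial q_\mu}\frac{\partial}{\partial q_\nu} f(q^2)
  = \frac{\partial}{\partial q_\mu}\left[ 2 q^\nu f'(q^2) \right]
  = 4\, q^\mu q^\nu f''(q^2) + 2\, g^{\mu\nu} f'(q^2)
```

The first term is the $2^n q^{\mu_1}\cdots q^{\mu_n}(\partial/\partial q^2)^n$ piece with $n = 2$, and the term proportional to $g^{\mu\nu}$ is an example of what the text calls a trace term.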


With this substitution, we can absorb all the trace terms into a single differential. 

The goal of this process is to be able to determine the nature of the structure 
functions of lepton—nucleon deep inelastic scattering. We are interested in the 
tensor: 
(14.142)


The relation of $T_i$ to the structure functions found earlier is given by:


$$W_{1,2} = \frac{1}{\pi}\,\mathrm{Im}\,T_{1,2} \qquad (14.143)$$


Inserting Eq. (14.139) into Eq. (14.142), we find:


eee Spades a 
Ty = om D[- 8 (724) LilG’ s&s Mt) 


al 


+ pyupy(2p-q)"2(—¢2)' "C3 (@, 8, yar (14.144) 


where $Q^2 = -q^2$ and:


CWO? eo) 


peta? 
(Q*) (=) [ ateemene?, §; i) 


Vand 2\n— a) eo ig: 
Chao Bo pL) = (Q°) ; (=) a et ACG &; 7) 


(14.145) 


Comparing this expression for $T_{\mu\nu}$ with the original definition, we easily find:
1 nf n 
Tie, O°) = se Dx "CLO, 8, WA} 


T(x, Q”) 


! —nt+l fn 2 
aM pee Cz i(Q", 8, MA; (14.146) 
Taking the moments of $T_{1,2}$ with respect to powers of $x$, we then find:


I 
= 1 an n 
[ dx x" 2 F(x, Q’) ~ g DS Chie. 8; LA; 


0 


1 
~ 1 an n 
[ dex MC) ~ GICLO% 8. WAT (14.147) 


Up to now, we have not used the power of the renormalization group. Since 
these form factors are physical, measurable quantities, we now impose the fact 
that they must obey renormalization group equations of the form: 


a d fan 2 
[ +A(@)3" Ojk — “| Cod ey = 0 (14.148) 
The solution to this equation is easy to find: 


€5,(0/u?, 8) ~ C5, BeOep { - i ay, (e«')| (14.149) 

where $t = (1/2)\log(Q^2/\mu^2)$. We now use the fact that, to lowest order, we have:


$$\beta = -b g^3 + O(g^5)$$

$$\gamma^n_{ij} = d^n_{ij}\, g^2 + O(g^4) \qquad (14.150)$$


(The $d^n_{ij}$, in fact, are exactly computable using lowest order perturbation theory.)
Then the Wilson coefficients obey:

$$\tilde C^n_{a,i}(Q^2/\mu^2, g) \sim \sum_j \tilde C^n_{a,j}(1, 0)\,\left[\log(Q^2/\mu^2)\right]^{-d^n_{ij}/2b} \qquad (14.151)$$
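
The origin of the power $-d/2b$ can be checked numerically. The sketch below is a single-operator illustration (no mixing, so $d^n_{ij}$ is replaced by one number $d$), with arbitrary sample values for $g$, $b$, $d$, and $t$ that are not taken from the text. It integrates the one-loop running coupling $d\bar g/dt = -b\bar g^3$ together with $\int_0^t d\,\bar g^2\,dt'$ and compares $\exp(-\int\gamma)$ with the closed form $(1 + 2bg^2t)^{-d/2b}$.

```python
import math

def leading_log_factor(g, b, d, t, steps=2000):
    """Integrate d(gbar)/dt = -b*gbar**3 with RK4 and accumulate the
    integral of gamma = d*gbar**2; return exp(-integral)."""
    def beta(x):
        return -b * x**3

    h = t / steps
    gbar, integral = g, 0.0
    for _ in range(steps):
        k1 = beta(gbar)
        k2 = beta(gbar + 0.5 * h * k1)
        k3 = beta(gbar + 0.5 * h * k2)
        k4 = beta(gbar + h * k3)
        g_next = gbar + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        # trapezoidal rule for the anomalous-dimension integral
        integral += 0.5 * h * d * (gbar**2 + g_next**2)
        gbar = g_next
    return math.exp(-integral)

def closed_form(g, b, d, t):
    """Analytic one-loop result (1 + 2*b*g^2*t)**(-d/(2*b))."""
    return (1 + 2 * b * g * g * t) ** (-d / (2 * b))

g, b, d, t = 1.0, 0.1, 0.3, 5.0   # illustrative values only
numeric = leading_log_factor(g, b, d, t)
exact = closed_form(g, b, d, t)
print(numeric, exact)
```

Since $t = (1/2)\log(Q^2/\mu^2)$, the factor $(1 + 2bg^2t)^{-d/2b}$ behaves like $[\log(Q^2/\mu^2)]^{-d/2b}$ at large $Q^2$, which is the logarithmic behavior quoted in Eq. (14.151).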


Reinserting these equations back into the expression for the integral of the mo- 
ments, we find: 


1 
Mi(Q2) = / dx x" F(x, 02/2) 
10) 
1 = vis 
~ 065,004" [log (02/02) 
| 
MNQ2) = L dx x" F(x, Q?/p2) 
0 
—d",/2b 


N OK, (1, 0)A” [log (Q?/n7)] (14.152) 


This is our final result. With a few modest assumptions and the renormalization
group equations, we can compute the logarithmic corrections to Bjorken scaling.
The point is that $F_i$ is now a function of both $x$ and $Q$, but the momentum
dependence is given by logs, which gives a weak violation of Bjorken scaling, as
observed experimentally.


14.10 Renormalization Group Proof 


We began our discussion of renormalization theory in Chapter 7 with $\phi^4$ theory,
but did not complete it because of the problem of overlapping divergences; for
example, there was no unique skeleton expansion of certain graphs, giving us
the headache of the overcounting of graphs. Although the $\phi^4$ renormalization
program was simpler than the one for QED, the final step could not be completed
because the skeleton reduction was not unique. For QED, however, the
renormalization program, although more difficult, could be completed because the
Ward–Takahashi identity allowed us to write the vertex graphs (which do not have
any overlapping divergences) as the derivative of the self-energy graphs (which
have overlapping divergences). Then we could write all recursion relations strictly
in terms of graphs that have no overlapping divergences. The essential idea was
that taking derivatives of self-energy graphs, which have no skeleton reduction,
creates insertions of zero momentum photons. The insertion of these photons,
we saw, converts a self-energy graph (without a skeleton reduction) into a vertex
graph (with a skeleton reduction).

Now, the renormalization group equations can be viewed as the “Ward” identity
for scale invariance. They will allow us to complete the renormalization of $\phi^4$
theory. The key, once again, is that taking the derivative of a self-energy graph
creates insertions of $\phi^2$ that give us vertex graphs that have a skeleton reduction.

We remarked earlier that almost any functional recursion relation, linking
the $(r+1)$st-order term to the $r$th-order term, can be used as a basis to prove the
renormalizability of a field theory, provided it involves no overlapping divergences.
We recall that the Callan–Symanzik relations were derived by taking the derivative
of a vertex function with respect to the mass. This squared the propagator, which
could then be interpreted as the insertion of the operator $\phi^2 = \theta$ into the theory. By
expanding out the derivative with respect to the unrenormalized mass, we found:


a a ' 
u— + B(g)— +ny(g) | T(p;g, ) —ip?a(g)P oO, p, g, u) 
Ou 0g 


—ip?a(g (0, 4g, p, 2, UL) 


a a n n 
(uz + B(s)5 +nv(s)) DEG. P,8, tt) 0B 


(14.153) 


where the second equation arises by taking the second derivative with respect to 
the mass squared. 

Our approach will now be to treat the Callan–Symanzik relations as the equivalent
of the Ward–Takahashi identity, giving us functional recursion relations that
will allow us to complete the induction step.¹⁷,¹⁸ We will discuss $\phi^4$ theory,
but the method is quite general. We can renormalize QED and non-Abelian gauge
theories without too much difficulty.

We recall that, for $\phi^4$ theory, the vertices $\Gamma^{(n)}$ for $n > 4$ all have a degree
of divergence less than zero. Furthermore, they have a skeleton expansion. In
this case, this means that they do not have subgraphs that have positive degree of
divergence (i.e., there are no nontrivial insertions of $\Gamma^{(2)}$ and $\Gamma^{(4)}$). Since the
overall divergence is negative and the divergence of all subgraphs is also negative,
the graph itself is convergent. If we can understand the behavior of $\Gamma^{(2)}$ and
$\Gamma^{(4)}$, then we can determine the behavior of all the $\Gamma^{(n)}$ for $n > 4$ by a skeleton
expansion. We will thus concentrate on these two types of vertices.

We will thus assume that $[\Gamma^{(4)}]_{r+1}$, $[\Gamma^{(2)}]_r$, and $[\Gamma^{(2)}_\theta]_r$ are all finite
quantities, where $r$ denotes the order of the perturbation expansion. We also assume
that $[\beta]_{r+1}$ and $[\gamma]_r$ are finite as well.

Our task is then to show that the renormalization group equations allow us to
complete the induction step, that is, to calculate $[\Gamma^{(4)}]_{r+2}$, $[\Gamma^{(2)}]_{r+1}$, and
$[\Gamma^{(2)}_\theta]_{r+1}$ in terms of the known quantities at a lower order. A close look at the
renormalization group equations shows that they are ideally suited for such a task.


The calculation is then carried out in three steps:


1. Calculate $[\Gamma^{(4)}]_{r+2}$ in terms of the finite quantities.


2. Calculate $[\Gamma^{(2)}_\theta]_{r+1}$.


3. Calculate $[\Gamma^{(2)}]_{r+1}$.


14.10.1 Step One 


The first calculation is rather easy, since we can write the renormalization group 
equations in the following fashion: 


) 
ee) SSeS) = | (P@z +4y) r®| ca 
[ ie [ e ee dg r+2 


The first term on the right-hand side has a skeleton expansion. This means that,
at most, it contains $[\Gamma^{(4)}]_{r+1}$, $[\Gamma^{(2)}]_r$, and $[\Gamma^{(2)}_\theta]_r$, and hence, by construction, it
is finite. (One might suspect that the overall integration over these finite pieces
might contribute a divergence, but since the superficial divergence is —1, there is 
no problem.) 

The last term in the previous expression causes some problems, since we have
the term $[\beta]_{r+2}$ and $[\gamma]_{r+1}$ multiplying the lowest-order term in $\Gamma^{(4)}$ (which is
$-ig$). Therefore one might worry about the term $[\beta]_{r+2} + 4g[\gamma]_{r+1}$ that appears
on the right-hand side.

However, we now use one more bit of information to show that this last 
remaining term is finite. If we take the zero momentum renormalization group 
equations, we find that they reduce to: 


[B+4ygl42 =u? [ro] (14.155) 


r+2 
But we already showed that this was finite (since it had a skeleton decomposition
given by finite quantities).

In summary, all terms on the right-hand side have been shown to be finite.


The last step, which is trivial, is to integrate both sides of the equation in order to
calculate $[\Gamma^{(4)}]_{r+2}$.


We can write the renormalization group equation as:


a 
Ba LE (e/te, A/t 8], 49 = [B (P/ Ms A/ Ms Brae (14.156) 


where we have now included explicitly the argument of the vertex functions, and
have written the right-hand side as simply $\Phi$. Integrating, we have:


1 
d 
[T° (p/w. A/us8)],. = —i8 =| = [®(ap/u,aA/u32)),47 (14.157) 


Thus, we have shown that all terms on the right-hand side of the renormalization
group equations are finite, so $[\Gamma^{(4)}]_{r+2}$ is also finite.


14.10.2 Step Two 


In the second step, we will rewrite the second renormalization group equation as: 


a eee ees) 
eer) [rs r+i me [ors | 
- | (Bez +2y + ») r?| (14.158) 
0g r+l 


This can also be shown to be finite by repeating the same steps given earlier.
We first remark that $\Gamma^{(2)}_{\theta\theta}$ is finite since it has a skeleton decomposition and can
be calculated in terms of finite, lower-order parts. The only troublesome term is
the one on the right, which appears in the combination $[2\gamma + \gamma_\theta]_{r+1}$, which does
not appear to be finite. However, as before, we simply take the zero momentum
limit of this equation. Then this precise combination $[2\gamma + \gamma_\theta]$ appears in the
low-momentum limit of $\Gamma^{(2)}_{\theta\theta}$, which we just showed to be finite. Finally, we then
integrate the entire equation in $\mu$ to arrive at an expression for $[\Gamma^{(2)}_\theta]_{r+1}$.


14.10.3 Step Three 


Finally, to compute the remaining function $[\Gamma^{(2)}]_{r+1}$, we write the first
renormalization group equation as:
a | (pce 5- Be 2v ss) (14.159)
0g r+i 


We also invoke similar arguments to show this is finite. First, we know that
$[\Gamma^{(2)}_\theta]_{r+1}$ is finite. Therefore, the only troublesome term is the one involving
$[\beta]_{r+2}$ and $[\gamma]_{r+1}$.

Although this last term does not appear to be finite, we can power expand the
previous renormalization group equation around $p^2 = 0$, and we have:


d 
yd = Ww? Ce Ge 


= (14.160) 
r+] 


This allows us to determine that $[\gamma]_{r+1}$ is finite, since the right-hand side is
finite. Also, we can show that $[\beta]_{r+2}$ is finite if we review our discussion of the
finiteness of $[\Gamma^{(4)}]_{r+2}$.

Thus, the entire right-hand side of the renormalization group equation can be
shown to be finite, so a simple integration over $\mu$ yields $[\Gamma^{(2)}]_{r+1}$ entirely in terms
of finite quantities.

In summary, the SLAC deep inelastic scattering experiments demonstrated 
the importance of Bjorken scaling. The simplest explanation of scaling comes 
from the naive quark model using either a parton description or light-cone com- 
mutators. However, this did not explain why the naive quark model should work 
so exceptionally well, why strong interaction corrections could be ignored, or 
why scaling set in so early. Ultimately, the scaling experiments were explained 
in terms of gauge theory and the renormalization group. We have seen that the 
coupling constant can change with the energy via the renormalization group
equations. Since $\beta$ is negative near the fixed point for theories with gauge fields,
we can prove that QCD is asymptotically free; that is, at asymptotic energies, 
the coupling constant goes to zero. Historically, the explanation of the SLAC 
experiments by asymptotically free gauge theory helped to convince the scientific 
community of the correctness of QCD, even though quarks have never been seen 
experimentally. 

Although the Standard Model has enjoyed great experimental success, there are 
still important gaps left unanswered by the theory. In particular, we cannot explain 
the low-energy spectrum of the hadrons until we understand quark confinement. 
Furthermore, we cannot understand the origin of the generation problem, the 
origin of the quark masses, etc. unless we go to theories beyond the Standard 
Model. In Part III, we will turn to these questions. 


14.11 Exercises


1. Show Eq. (14.13).


2. Rewrite the light-cone commutator of two currents using scalars, as in Eq.
(11.61), rather than spinors. How will Eq. (14.68) be modified?


3. Prove the mass-independent MS prescription for $\gamma$ in Eq. (14.134).

4. Solve the Callan–Symanzik equation in the deep Euclidean region.


5. Derive the analogue of Eq. (14.44) if we include the $b$ and $t$ quarks.


6. Consider the effect of the scale transformation $x^\mu \to \lambda x^\mu$ on a massless scalar
field. The variation of a scalar field is given by:


$$\delta\phi(x) = (1 + x^\mu\partial_\mu)\phi \qquad (14.161)$$

Prove that the variation of the Lagrangian is given by:

$$\delta\mathscr{L} = \partial_\mu(x^\mu\mathscr{L}) \qquad (14.162)$$

and that the Noether current is given by:

$$J^\mu = x_\nu T^{\mu\nu} + \frac{1}{2}\partial^\mu\phi^2 \qquad (14.163)$$


If the scalar particle has a mass, show that the trace of the energy-momentum
tensor is proportional to the mass squared.


7. Consider the generators of an algebra given by: $P^\mu = \gamma^\mu/R$ and
$M^{\mu\nu} = (1/2)\sigma^{\mu\nu}$. Show that these generate the algebra $O(4, 1)$, the de Sitter group.
In the limit of $R \to \infty$ (i.e., in the limit that the de Sitter sphere approaches
ordinary space-time), prove that these generate the Poincaré algebra. (This
is called the Wigner–Inönü contraction.)


8. Now consider the algebra generated by:
Py =) 10, 
Ky = 2xx'd,— x4, 


My = (xpd — x,d,) 


oS 
\\ 


x", (14.164) 
Prove that they have the commutation relations:


[Pas Pl = 9 

[Ky Py] = —2(iguyD + Myv) 

[X,Y =. 3K, 

ee wee (14.165) 


Complete all the other commutators. What transformations do these generators
induce on $x_\mu$ space? Show that $D$ corresponds to a scale transformation,
and that $K_\mu$ corresponds to an operator that is the product of an inversion,
translation, and another inversion. (An inversion is given by $x_\mu \to x_\mu/x^2$.)


9. Show that this algebra generates $O(4, 2)$.


10. Prove that:


O(4, 2) = SU(2, 2) (14.166) 


which is called the conformal group. 


11. Discuss how to use the renormalization group to prove the renormalizability
of QED. Set up the basic equations, discuss how the recursion relations might
work, but do not solve.


12. Complete the missing steps needed to prove Eqs. (14.131) and (14.132).

13. Prove Eq. (14.60).

14. Prove Eq. (14.61).

15. Prove Eq. (14.64).

16. Prove Eq. (14.68).

17. In the background field method, we expand $A_\mu$ around a classical background
field $B_\mu$ that satisfies the equations of motion:


$$A_\mu \to B_\mu + A_\mu \qquad (14.167)$$


where $A_\mu$ represents the quantum fluctuations. Let us define their
transformation properties as:


$$\delta B_\mu = \partial_\mu\Lambda + g[B_\mu, \Lambda]$$

$$\delta A_\mu = g[A_\mu, \Lambda] \qquad (14.168)$$

Choose the gauge fixing term $F$ to be:

$$F = \partial_\mu A^\mu + g[B_\mu, A^\mu] = D(B)_\mu A^\mu \qquad (14.169)$$


Prove that the gauge-fixing term violates gauge invariance, but preserves 
the new invariance as defined. Prove that the Faddeev—Popov determinant 
is also invariant. Thus, the gauge-fixed perturbation theory is still invariant 
under the new gauge. Counterterms are also invariant, which vastly simplifies 
calculations. 


Part III 


Nonperturbative Methods 
and Unification 


Chapter 15 
Lattice Gauge Theory 


Unfortunately, the color gauge theory will remain in limbo unless we 
learn how to solve it and in particular get the spectrum out of it. So, 
in particular, I wish to emphasize how one might solve the color gauge
theory to get a spectrum...
—K. Wilson 


15.1 The Wilson Lattice 


Although QCD is the leading candidate for a theory of the strong interactions, the 
embarrassing fact is that perturbation theory fails to reproduce many of the essen- 
tial low-energy features of the hadron world, such as the spectrum of low-lying 
hadron states. Perturbation theory seems to be effective only in the asymptotic 
region, where we can use the arguments of renormalization group theory to make 
a comparison between theory and experimental data. 

Nonperturbative methods have proved to be notoriously difficult
in quantum field theory. However, one of the most elegant and powerful
nonperturbative methods is Wilson's lattice gauge theory,¹ where one may put QCD
on a computer and, in principle, calculate the basic features of the low-energy
strong-interaction spectrum. In fact, the only apparent limitation facing lattice
gauge theory is the available computational power.*

Monte Carlo methods,** in particular, have given us rough qualitative agree- 
ment between experiment and theory, giving us the hope that, with a steady 
increase in computer power, we might be able to reduce the discrepancy between 
theory and experiment. 

We should also point out that we must pay a price for putting QCD on the 
lattice. First, because the metric is Euclidean, it means that present calculations 
with lattice gauge theory are limited to the static properties of QCD. Although 
lattice gauge theory may be good for confinement and perhaps the low-energy 
spectrum of states, it has difficulty calculating scattering amplitudes, which are 
defined in Minkowski space. (One can, in principle, make an analytic continuation 
from Euclidean to Minkowski space, but then we need to have much greater
computational power than what is currently available.) Second, lattice gauge 
theory explicitly breaks continuous rotational and translational invariance, since 
space-time is discretized. All that is left is symmetry under discrete rotations of 
the lattice. (Presumably, we can recover continuous symmetries when we let the 
lattice spacing go to zero.) Third, we are limited by the available computational 
power. Lattice sizes are thus unrealistically small, on the order of a fermi, so
that important effects that enter at larger distances are cut off. However, since
computer power is increasing exponentially, there is hope that we will one day 
soon extract a realistic spectrum from lattice gauge theory. 

Let us begin by defining the simplest lattice in four dimensions, a Euclidean
hypercubical lattice with equal lattice spacing $a$ in the $x$, $y$, $z$, and $t$ directions. If
we take the limit as $a \to 0$, then our action should reduce to the usual Yang–Mills
action.

Between two neighboring sites of the lattice, we define a “string bit” or “link,”
which is a member of $SU(3)$ and is denoted by $U(n, n+\hat\mu)$. This string bit
connects the $n$th point with the $(n+\hat\mu)$th point, where $\hat\mu$ is the unit vector in the
$\mu$th lattice direction.

We define this string bit or link to be unitary:


$$U(n, n+\hat\mu)^\dagger = U(n, n+\hat\mu)^{-1} = U(n+\hat\mu, n) \qquad (15.1)$$


Taking the inverse of a link therefore reverses its orientation. Since a unitary
matrix can be written as the exponential of $i$ times a Hermitian matrix, we can write:


$$U(n, n+\hat\mu) = \exp\left[i a g\,\frac{\lambda^a}{2} A^a_\mu(n)\right] \qquad (15.2)$$


where $g$ is the coupling constant, $\lambda^a$ are the generators of $SU(N)$, and $A^a_\mu(n)$ is the
gauge field.

We define a plaquette as a square face of the lattice with dimensions $a \times a$
(Fig. 15.1). Our action is equal to tracing the $U$'s around each of the squares of the
lattice:


$$S = -\frac{1}{2g^2}\sum_p \mathrm{Tr}\,U_p$$

$$U_p = U(n, n+\hat\mu)\,U(n+\hat\mu, n+\hat\mu+\hat\nu)\,U(n+\hat\mu+\hat\nu, n+\hat\nu)\,U(n+\hat\nu, n) \qquad (15.3)$$


where we symbolically sum over all plaquettes $p$ in the four-dimensional lattice,
and where with each $p$ we associate the point $(n, \hat\mu, \hat\nu)$. The essential point is
Figure 15.1. A plaquette on a Wilson hypercubical lattice. The action is defined as the
sum of traces over these plaquettes. 


that this formulation is both gauge invariant and also reproduces the Yang–Mills
theory in the continuum limit.
A gauge transformation is defined as:


$$U(n, n+\hat\mu) \to \Omega(n)\,U(n, n+\hat\mu)\,\Omega^{-1}(n+\hat\mu) \qquad (15.4)$$


Notice that a string bit is defined between two neighboring sites, while the gauge
parameter $\Omega(n)$ is defined at a lattice site. Our action is invariant under this
transformation, since every $\Omega(n)$ in the transformed action cancels against an
$\Omega^{-1}(n)$ in $\mathrm{Tr}\,U_p$.
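
The cancellation can be seen explicitly on a single plaquette. The following is a minimal sketch under simplifying assumptions: it uses random $SU(2)$ matrices rather than $SU(3)$ to keep the matrix code short, labels the four corners of one plaquette 0 to 3, and uses helper names (`random_su2`, `plaquette_trace`) that are purely illustrative.

```python
import math
import random

def random_su2(rng):
    """Random SU(2) matrix a0*1 + i*(a . sigma) built from a unit 4-vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(x * x for x in v))
    a0, a1, a2, a3 = (x / norm for x in v)
    return [[complex(a0, a3), complex(a2, a1)],
            [complex(-a2, a1), complex(a0, -a3)]]

def matmul(A, B):
    """Product of two 2x2 complex matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(A):
    """Hermitian conjugate, which is the inverse for a unitary matrix."""
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def plaquette_trace(links):
    """Trace of the ordered product of the four links around a plaquette."""
    P = links[0]
    for U in links[1:]:
        P = matmul(P, U)
    return P[0][0] + P[1][1]

rng = random.Random(0)
# Corners 0..3 stand for n, n+mu, n+mu+nu, n+nu; links[i] runs from corner i
# to corner i+1 (cyclically), matching the ordering in Eq. (15.3).
links = [random_su2(rng) for _ in range(4)]
omega = [random_su2(rng) for _ in range(4)]   # one gauge matrix per corner

# U(x, y) -> Omega(x) U(x, y) Omega(y)^{-1}, as in Eq. (15.4)
transformed = [matmul(matmul(omega[i], links[i]), dagger(omega[(i + 1) % 4]))
               for i in range(4)]

before = plaquette_trace(links)
after = plaquette_trace(transformed)
print(abs(before - after))  # zero up to rounding
```

Each interior $\Omega^{-1}$ meets the $\Omega$ of the next link at the shared corner, and the remaining pair at corner 0 cancels under the cyclic property of the trace.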

Finally, we take the continuum limit. To do this, we must use the
Baker–Campbell–Hausdorff theorem to combine each of the $U$'s in a plaquette into a
single exponential. We use the equation:


$$e^A e^B = e^{A + B + \frac{1}{2}[A, B] + \cdots} \qquad (15.5)$$


In general, we have an infinite number of terms appearing in this expansion,
corresponding to all possible multiple commutators between $A$ and $B$. However,
because we take the limit as $a \to 0$, we need only keep the first-order terms in
this expansion.
For example, if we keep the lowest-order terms and drop all commutators, we
find terms like:
7d bi hee. a 
expiag = [A,(n + ft) — A,(n)---J* — expia’g(A*/2) [a Av(n) : | (15.6) 
Putting everything together, we find that we can write the action as: 


S= -5 5 pe exp [ia’g” Fyy(n) +--+] (15.7) 
where $A_\mu(n) = A^a_\mu(n)\lambda^a/2$ and:
$$F_{\mu\nu}(n) = \partial_\mu A_\nu(n) - \partial_\nu A_\mu(n) + ig[A_\mu(n), A_\nu(n)] \qquad (15.8)$$


Taking the limit as $a \to 0$, we find the continuum result:


nw ; / cb ay Oey (ade (15.9) 


We thus recover the continuum theory in the $a \to 0$ limit.


15.2 Scalars and Fermions on the Lattice 


We have placed gauge particles on the lattice in an elegant fashion, preserving 
exact gauge invariance on the lattice, even with finite spacing. We now generalize 
these results to put scalars and fermions on the lattice. In particular, we will find 
curious complications when fermions are introduced. 

To put scalars on the lattice, we must make the substitution: 


$$\partial_\mu\phi \to \frac{1}{a}\left(\phi_{n+\hat\mu} - \phi_n\right) \qquad (15.10)$$


With this simple substitution, we find that the scalar action becomes: 
1 1 Xr 
S = | d*x (5a.0a"e + 5m oe + io’) 
24 2 
a 2, ,4(™ 12, A 44 
= ) I$ D One =O, +a (0 + wt) Cioabd) 


To calculate the propagator of the scalar particle on the lattice, we will find it
convenient to go to momentum space. We wish to replace $\phi_n$ with its Fourier
transform $\phi(k)$. We will define:


_ fae 
~ J) eye 


dn e'" O(k) (15.12) 
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We will arbitrarily truncate the integral, since wavelengths smaller than twice the 
size of the lattice can be discarded. We will take: 


ahs 


i, = = (15.13) 


Now let us insert the Fourier expansion of ¢,, into the free action of the scalar 
field on the lattice. The free part can be calculated by taking a double integral 
over k and k’: 


a’ d*k i(k+k’)-n 7 iak iak’ 
pas Qnyt sae eT ae 
d*k 
ony (eitku ae 1)(e7 ak Sq) 
4 
4 / a sin’(ak,, /2) (15.14) 


Inserting this back into the free action, we now have:

S_0 = \frac{1}{2} \int \frac{d^4k}{(2\pi)^4} \Big[ \frac{4}{a^2} \sum_\mu \sin^2(a k_\mu / 2) + m^2 \Big] \phi(-k) \phi(k)    (15.15)


Not surprisingly, this differs from the usual propagator defined in momentum
space. Normally, the Euclidean Klein–Gordon equation has a propagator given
by 1/(k² + m²). On the lattice, the propagator is generated by taking the inverse
of:

k^2 + m^2 \to m^2 + \frac{4}{a^2} \sum_\mu \sin^2(a k_\mu / 2)    (15.16)

In the limit as a → 0, we find that the two expressions are identical (for small k).
Both are parabolic, as shown in Figure 15.2. (For large k, the two expressions
differ noticeably. However, large values of k are cut off.)
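
The agreement at small k and the departure near the zone edge can be checked numerically. The following is a minimal sketch, assuming the lattice inverse propagator m² + (4/a²) Σ_μ sin²(a k_μ/2) of Eq. (15.16); the function names and the sample values of a, m, and k are illustrative choices, not taken from the text.

```python
import math

def lattice_inverse_propagator(k, a, m):
    """m^2 + (4/a^2) * sum_mu sin^2(a k_mu / 2), the lattice form in Eq. (15.16)."""
    return m**2 + (4.0 / a**2) * sum(math.sin(a * k_mu / 2.0) ** 2 for k_mu in k)

def continuum_inverse_propagator(k, m):
    """k^2 + m^2, the Euclidean Klein-Gordon form."""
    return m**2 + sum(k_mu**2 for k_mu in k)

# Illustrative values (assumptions): lattice spacing, mass, and two sample momenta.
a, m = 0.1, 1.0
small_k = (0.5, 0.0, 0.0, 0.0)          # well inside the Brillouin zone
edge_k = (math.pi / a, 0.0, 0.0, 0.0)   # at the zone boundary k = pi/a

for label, k in (("small k", small_k), ("zone edge", edge_k)):
    print(label,
          lattice_inverse_propagator(k, a, m),
          continuum_inverse_propagator(k, m))
```

For the small momentum the two values differ only in the fifth significant figure, while at the zone edge the lattice expression saturates at m² + 4/a² and the continuum one keeps growing like k².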

The relative ease with which we could put scalar particles on the lattice
contrasts with the relative difficulty of placing fermions, especially quarks, on
the lattice. A number of problems, both conceptual and computational,
arise.

As before, we make the substitution:

\partial_\mu \psi \to \frac{1}{a} (\psi_{n+\mu} - \psi_n)    (15.17)


Figure 15.2. For small k, both the lattice and continuum inverse propagators behave like
k² + m². For large k, where the lattice approximation is not reliable, they differ.


With this substitution, our lattice fermionic action becomes:

S = \sum_n \Big( \frac{a^3}{2} \sum_\mu \bar\psi_n \gamma_\mu (\psi_{n+\mu} - \psi_{n-\mu}) + m a^4\, \bar\psi_n \psi_n \Big)    (15.18)

As before, we take the Fourier transform of the ψ_n field. This gives us the action:

S = \int \frac{d^4k}{(2\pi)^4}\, \bar\psi(-k) \Big( i \sum_\mu \gamma_\mu \frac{\sin(a k_\mu)}{a} + m \Big) \psi(k)    (15.19)

Therefore, we wish to examine the properties of the expression:

\frac{1}{a^2} \sum_\mu \sin^2(a k_\mu) + m^2    (15.20)

Unfortunately, this has bad behavior as we take the continuum limit. As shown in
Figure 15.3, this expression has two equal minima within the Brillouin zone for
each component k_μ. One is located at k_μ = 0, as before. However, there is also
a minimum located at k_μ = ±π/a.

Therefore, we have an unphysical doubling problem; that is, the lattice fermion
theory does not give us the correct continuum limit. In fact, since we have a
doubling for each space-time dimension, we actually have 2⁴ = 16 times too
many fermions.
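
The count of 16 can be reproduced by enumerating the points of the Brillouin zone at which the kinetic part of Eq. (15.20) vanishes. This is a small sketch under the assumption that the kinetic term is (1/a²) Σ_μ sin²(a k_μ) and that k_μ = π/a and k_μ = −π/a are identified; the function name is an illustrative choice.

```python
import itertools
import math

def naive_fermion_kinetic(k, a):
    """(1/a^2) * sum_mu sin^2(a k_mu), the momentum-dependent part of Eq. (15.20)."""
    return sum(math.sin(a * k_mu) ** 2 for k_mu in k) / a**2

a = 1.0  # lattice spacing in arbitrary units (assumption)

# Each component of k can sit at 0 or at the zone boundary pi/a.
corners = list(itertools.product((0.0, math.pi / a), repeat=4))
zeros = [k for k in corners if naive_fermion_kinetic(k, a) < 1e-12]

print(len(zeros))
```

Only the corner with all components equal to zero corresponds to the intended continuum fermion; the remaining corners are the doublers.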

Several solutions have been proposed to cure this problem, none of them without
some drawbacks. One convenient solution to the fermion doubling problem
is to modify the lattice fermion action by hand so as to cancel the unwanted
zero. We can always do this as long as the correct continuum limit is obtained.

Figure 15.3. Fermion doubling problem: For small k, both the continuum and lattice
fermion inverse propagators behave the same; for larger k, the lattice fermion propagator
has other minima.


We add to the previous action the following Wilson term:

-\frac{a^3}{2} \sum_{n,\mu} \bar\psi_n (\psi_{n+\mu} + \psi_{n-\mu} - 2\psi_n)    (15.21)

If we now calculate the momentum-space contribution of this term and add it
to the previous one, we find:

S = \int \frac{d^4k}{(2\pi)^4}\, \bar\psi(-k) \Big[ i \sum_\mu \gamma_\mu \frac{\sin(a k_\mu)}{a} + m - \frac{1}{a} \sum_\mu (\cos a k_\mu - 1) \Big] \psi(k)    (15.22)

The last term, containing the cosine, preserves the original minimum at k = 0
but eliminates the unwanted one.
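
How the cosine term removes the doublers can be seen by evaluating the mass-like part of the action at the 16 corners of the Brillouin zone. The sketch below assumes that this part has the form m + (r/a) Σ_μ (1 − cos a k_μ); the coefficient r is a hypothetical parameter introduced here for illustration (r = 1 corresponds to a unit coefficient), and the numerical values are arbitrary.

```python
import itertools
import math

def wilson_mass(k, a, m, r=1.0):
    """Effective mass m + (r/a) * sum_mu (1 - cos(a k_mu)) seen by a mode at momentum k.

    r is an illustrative Wilson coefficient (assumption, not fixed by the text).
    """
    return m + (r / a) * sum(1.0 - math.cos(a * k_mu) for k_mu in k)

a, m = 0.1, 1.0  # illustrative lattice spacing and bare mass
corners = list(itertools.product((0.0, math.pi / a), repeat=4))
masses = sorted(wilson_mass(k, a, m) for k in corners)

print(masses[0], masses[1], masses[-1])
```

The mode at k = 0 keeps its mass m, while a doubler with j components at π/a acquires m + 2rj/a, which grows without bound as a → 0 and so decouples in the continuum limit.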

The existence of this fermion doubling problem is related to the anomaly
problem. As we saw previously, the various regulator schemes that we have
studied for Feynman integrals violate chiral symmetry. For example,
Pauli–Villars regularization adds infinite-mass fermions, which violate chiral symmetry.
Similarly, dimensional regularization has problems because of the presence of γ₅,
which cannot be generalized to d-dimensional space. Thus, chiral symmetry is
not respected by these regulator schemes, and hence an anomaly arises.

On the lattice, however, chiral symmetry is exact for massless fermions. 
Since chiral symmetry is respected by the lattice theory, there can be no anomaly. 
However, there is a price we pay for the absence of the anomaly, and this is the 
doubling of the fermion chiralities such that the anomaly cancels. If we calculate 
the chiralities of the two types of fermions, we find that they are opposite and hence 
produce no anomaly. The fermion doubling problem is thus deeply connected with 
the problem of chiral symmetry breaking on the lattice. For example, adding the
Wilson correction term violates chiral symmetry, even for zero-mass fermions.
Thus, studying chiral symmetry breaking on the lattice is always a bit delicate.

An even more difficult problem, from the point of view of computation, is the 
problem of quark loops. This is because Grassmann variables cannot be modeled 
on a computer. We cannot use Monte Carlo methods to minimize an action with 
Grassmann variables. 

However, we can functionally integrate out the fermion contribution entirely,
yielding determinant factors. For example, we have:

\int D\bar\psi\, D\psi\, \exp\Big( \int d^4x\, \bar\psi\, (i \slashed{D} - m)\, \psi \Big) = \det (i \slashed{D} - m)    (15.23)

These determinants, in turn, can be modeled on a computer, although they are
nonlocal and quite difficult to compute. This is unfortunate, since a computer
simulation of QCD necessarily involves quarks. This remains one of the main
computational problems facing lattice gauge theory. (However, calculations
omitting the fermion loops, called the “quenched approximation,” exhibit many of the
nonperturbative features we expect in the final theory.)


15.3 Confinement


One of the main reasons for introducing lattice gauge theory is to calculate effects 
that lie beyond the reach of perturbation theory, such as the confinement of quarks. 
Although quark confinement has not been rigorously proved within the framework 
of QCD, we provide compelling reasons for believing that quarks are confined in 
lattice gauge theory. 

In general, if the potential between two quarks is proportional to the distance
between them, then the two quarks can never be separated:

Confinement:  V(r) \sim \sigma r    (15.24)

where σ is called the string tension. If we try to separate the quarks by force,
then the energy stored in the linear potential between them grows without
bound, preventing them from being separated. Furthermore, the string may break,
creating a quark–antiquark pair held together by another string. Thus, they can
never be separated if they are bound by a linear potential. Conversely, if the quark
potential asymptotically becomes a constant or decreases with distance, then the
potential is not sufficient to confine the quarks.

We will now see that, in the strong-coupling limit, the lattice gauge theory
confines quarks. Let us first set up the Wilson loop:

W(C) = P\, \mathrm{Tr}\, \exp\Big( i \oint_C A_\mu\, dx^\mu \Big)    (15.25)

where C represents a closed loop and where P represents the path ordering of the
exponential along the loop. (This means that we cannot simply perform the line
integral over dx^μ along the loop. We must first split up the line integral into an infinite
number of infinitesimally small exponentials, and take the product of them ordered
sequentially along the loop.) Notice that the Wilson loop is gauge invariant.

We shall primarily be interested in the Wilson loop because it gives us a
criterion for confinement. The counterpart of the Wilson loop for the lattice is
given by:

W(C) = \mathrm{Tr} \Big( \prod_{i \in C} U_i \Big)    (15.26)


where we take the product around a discretized loop C. We will be interested in 
the behavior of W(C) where C is a rectangular loop with width R in one spatial 
direction and length T in the time direction, in the limit of large T. 

Our strategy is to rewrite the path-ordered Wilson loop W(R, T) in terms of
the matrix elements of gauge-invariant, two-quark states. The two-quark state at
time t is given by:

|\bar q(t, 0)\, q(t, R)\rangle = \sum_C f(C)\, \bar q(t, 0)\, P_C \exp\Big( i \int_{(t,0)}^{(t,R)} A_\mu\, dx^\mu \Big) q(t, R) |0\rangle
  \equiv \Gamma(t, R) |0\rangle    (15.27)

where the quark states are at equal times but are separated by a spatial distance R,
where P_C takes the path-ordered exponential along the path C, and f(C) is some
function along the path C. The sum over C is taken over all paths that connect
the two points. The presence of the path-ordered exponential guarantees that this
two-quark state is gauge invariant.

Now construct the overlap function by taking the matrix element of the
two-quark state at time t = 0 and at some later time t = T. After inserting a complete
set of intermediate states, and taking the limit T → ∞, we find:

\lim_{T\to\infty} \Omega(T, R) = \lim_{T\to\infty} \langle \bar q(T, 0)\, q(T, R) | \bar q(0, 0)\, q(0, R) \rangle
  = \lim_{T\to\infty} \langle 0 | \Gamma^\dagger(T, R)\, \Gamma(0, R) | 0 \rangle
  = \lim_{T\to\infty} \sum_n |\langle n | \Gamma(0, R) | 0 \rangle|^2\, e^{-E_n T}
  \sim e^{-E_0(R) T}    (15.28)

(In the T → ∞ limit, only the smallest energy eigenvalue E_0 dominates the
right-hand side of the equation.)

Now let us contract the quark wave functions that appear within Ω(T, R). The
quark propagator (in a background of gluon fields) can be approximated as:

\langle 0 | q^a(t, x)\, \bar q^b(t', x) | 0 \rangle = \exp\Big( i \int_{t'}^{t} d\tau\, A_0(\tau, x) \Big) \langle 0 | q^a(t, x)\, \bar q^b(t', x) | 0 \rangle_0    (15.29)

where the subscript 0 refers to free quark fields.

Within Ω, we contract the quark field at (0, 0) with the one at (T, 0), and the
quark field at (0, R) with the one at (T, R). With this contraction, we find that
there are now four contributions to the path-ordered exponential integral. These
four contributions complete a path-ordered integral over the sides of a closed
rectangle, whose vertices are given by (0, 0), (0, R), (T, 0), and (T, R). Because
the exponentials are now taken over a closed loop, Ω(T, R) is thus proportional
to W(C). We have, therefore:

\lim_{T\to\infty} \Omega(T, R) \sim \lim_{T\to\infty} W(R, T) \sim e^{-E_0(R) T}    (15.30)

If the potential between the quarks grows like the distance of separation R, then
the quarks are confined. In that case E_0(R) ≈ σR, and we therefore have:

W(R, T) \sim \exp(-\sigma R T)    (15.31)

Since the area of the Wilson loop is RT, the area law for the behavior of the
Wilson loop for large T gives us confinement.

If the quark potential goes to a constant m for large distance, then we have:

W(R, T) \sim \exp(-m T)    (15.32)

which gives us a perimeter law for nonconfining potentials.


15.4 Strong Coupling Approximation 


Since the area law gives us a criterion for confinement, our next task is to calculate
the functional integration in the path integral for gauge theory to see if QCD gives
us a confining theory. We wish to calculate the functional integral of the Wilson
loop for small 1/g² (i.e., the strong-coupling approximation):

\langle W(C) \rangle = Z^{-1} \int DU\, \mathrm{Tr}\,(U_1 U_2 \cdots U_N) \exp\Big( \frac{1}{2g^2} \sum_p \mathrm{Tr}\, U_p \Big)    (15.33)

where U_1 U_2 ⋯ U_N symbolically represents the product of a series of U matrices
around some closed path C in the lattice.

In order to perform this integration in the strong-coupling limit, we use the
invariant group integration dU introduced in Chapter 9. We recall that if U is an
element of a Lie group, then the invariant measure dU obeys the property:

d(U'U) = dU    (15.34)

for fixed U′. dU is easy to construct for SU(2). For this group, we can reduce
the string bit to:

U \sim \mathbf{1}\, a_0 + i \boldsymbol{\sigma} \cdot \mathbf{a}    (15.35)
where: 
6;; = =galA? (15.36) 


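and a_0² + a·a = 1. Then it can be shown that the invariant group measure for
SU(2) is given by:

dU = \pi^{-2}\, da_0\, da_1\, da_2\, da_3\, \delta\Big( \sum_{j=0}^{3} a_j^2 - 1 \Big)    (15.37)

With this explicit representation of the group measure, we can easily prove:

\int dU\, U_{ij} = 0
\int dU\, U_{ij} U_{kl} = \frac{1}{2} \epsilon_{ik} \epsilon_{jl}
\int dU\, U_{ij} U^\dagger_{kl} = \frac{1}{2} \delta_{il} \delta_{jk}    (15.38)

With these integrals, we can perform the strong-coupling expansion of the
Wilson loop. First, we reduce the plaquette to:

U_p \sim \exp\big( i g a^2 F^i_{\mu\nu} \sigma^i / 2 \big)
    \sim \mathbf{1} \cos\theta_p + i \boldsymbol{\sigma} \cdot \mathbf{n}_p \sin\theta_p    (15.39)

where:

\theta_p = \frac{1}{2} g a^2 |F^i_{\mu\nu}|    (15.40)

The action for the lattice gauge theory, per plaquette, then becomes:

\frac{1}{g^2} \Big( 1 - \frac{1}{2} \mathrm{Tr}\, U_p \Big) = \frac{1}{g^2} (1 - \cos\theta_p)    (15.41)

Then the expectation value of a Wilson loop becomes:

\langle W(C) \rangle = Z^{-1} \int DU\, \mathrm{Tr}\,(U_1 U_2 \cdots U_N) \exp\Big( -\frac{1}{g^2} \sum_p (1 - \cos\theta_p) \Big)    (15.42)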
P 


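In the strong-coupling limit, we want to power expand this expression in terms
of 1/g². If we expand the exponential, to lowest nonvanishing order we have:

\langle W(C) \rangle = Z^{-1} \int DU\, \mathrm{Tr}\,(U_1 U_2 \cdots U_N) \prod_p \Big( \frac{1}{2g^2} \mathrm{Tr}\, U_p \Big)    (15.43)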


Because of the identities in Eq. (15.38), the functional integral is zero unless
each U within a plaquette is paired off with the same U appearing elsewhere in the
integral. Unless the pairing takes place, the resulting integral is zero because
∫ dU U = 0. The pairing can be performed in two ways: U can pair off with the
same U appearing within the Wilson loop, or with the same U appearing within a
neighboring plaquette.

This stringent condition sets almost all the terms in the integral to zero; the 
only nonzero contribution comes from plaquettes that completely fill the two- 
dimensional space within the Wilson loop. This effect is called “tiling”; that is, 
the only nonzero contribution comes from the plaquettes arranged like tiles within 
the loop. Each plaquette borders another plaquette, or borders the Wilson loop. 
In this way, each U appearing in the integral appears twice, either in neighboring 
plaquettes or in a plaquette and the Wilson loop. 

The logarithm of the functional integral is therefore proportional to the number
of plaquettes tiling the loop; that is, it is proportional to A/a², where A is the minimal
area of a surface that fills up the loop C. The integral can be approximated in this
limit, and the strong-coupling expansion gives:

\langle W(C) \rangle \sim \exp\Big[ -\frac{A f(g)}{a^2} \Big]    (15.44)

where f(g) = log g² to lowest order.
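
The tiling argument can be turned into a short numerical check. The sketch below assumes that each of the A/a² plaquettes tiling an R × T loop contributes one factor of 1/g², which is what f(g) = log g² amounts to at lowest order; numerical prefactors from the group integrals are ignored, and the function names are illustrative.

```python
import math

def strong_coupling_wilson_loop(R, T, a, g):
    """Lowest-order estimate: one factor 1/g^2 for each plaquette tiling the R x T loop."""
    n_plaquettes = round(R / a) * round(T / a)
    return (1.0 / g**2) ** n_plaquettes

def string_tension(a, g):
    """sigma = f(g)/a^2 with f(g) = log g^2, read off from the area law."""
    return math.log(g**2) / a**2

R, T, a, g = 2.0, 3.0, 1.0, 2.0  # illustrative values in lattice units
W = strong_coupling_wilson_loop(R, T, a, g)
print(W, math.exp(-string_tension(a, g) * R * T))
```

Both printed numbers agree, which is simply the statement that a product of A/a² equal factors is an exponential of the area.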

The crucial point is that this trace goes as the exponential of the area of the 
enclosed loop. We have, therefore, formally proved that the strong-coupling limit 
produces a confining theory for the simplest SU(2) gauge theory. 

A curious phenomenon occurs, however, when we try to reach the continuum
limit in the strong-coupling approximation. In the continuum limit, we need to
keep:

\frac{f(g)}{a^2} = \text{constant}    (15.45)

If this condition is not met, then the trace formula becomes singular and
meaningless. Since f(g) = log g², holding this ratio fixed as a → 0 forces g² → 1,
which lies outside the strong-coupling regime. Therefore, after taking the
strong-coupling approximation, we cannot
take the continuum limit. Although this seems to be a problem, it is actually a
blessing in disguise, because the discussion we have just made applies to QED,
which we know is not a confining theory in the weak-coupling regime. Thus, we
wish to have a phase transition separating the weak- and strong-coupling regimes
for QED.

However, for non-Abelian gauge theory, we do not want a phase transition separating these
two regimes, because we want a theory of confinement for the quarks. Here we
see the crucial role played by non-Abelian gauge theory; QED has a qualitatively
different phase structure than non-Abelian gauge theory.

All these results can be generalized to SU(3) and higher. For SU(3), we need
the identities:

\int dU\, U_{mn} = 0
\int dU\, U_{mn} U^\dagger_{pq} = \frac{1}{3} \delta_{mq} \delta_{np}
\int dU\, U_{mn} U_{pq} = 0    (15.46)

Then the calculation proceeds as before, giving us the area law and hence a
confining theory for SU(3) lattice gauge theory in the strong-coupling limit.


15.5 Monte Carlo Simulations


So far, our results have been qualitative and not quantitative. One of the great 
advances in lattice gauge theory is the Monte Carlo method, where we can use 
supercomputers to calculate a large number of numerical results for QCD. 

A brute force calculation of the path integral, of course, is out of the question.
If we have the simplest possible group defined on the lattice, Z_2, with elements
±1, and if the lattice is 8 × 8 × 8 × 8 in size, then there are 4 × 8⁴ = 16384
links, and the sum contains the following number of terms:

2^{16384} \approx 10^{4932}    (15.47)


Clearly, this is prohibitive. The Monte Carlo technique, however, evades this 
problem by making certain approximations to the path integral. 
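
The size of this sum is easy to check. The short sketch below assumes one Z_2 variable on each link and four links per site of the 8⁴ lattice; the variable names are illustrative.

```python
import math

sites = 8**4
links = 4 * sites                      # four forward links per site in four dimensions
log10_terms = links * math.log10(2.0)  # log10 of 2**links

print(links, log10_terms)
```

The logarithm comes out just above 4932, so the number of terms has 4933 decimal digits.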

The path integral, in general, sums over an enormous number of configurations 
that contribute almost nothing to the integral. We wish to throw most of them away, 
while keeping the ones that tend to minimize the action. The Monte Carlo method 
gives us a specific algorithm by which to accept only these gauge configurations. 

Let Σ_1 be a certain set of initial values for each of the various links for
the entire lattice (say, each link equals one). Then the Monte Carlo method
generates a sequence of configurations Σ_2, Σ_3, .... When statistical equilibrium
is eventually reached, the probability of encountering any specific configuration
Σ in the sequence is proportional to e^{−S(Σ)}. Then the expectation value of any
observable O may be approximated as:

\langle O \rangle \approx \frac{1}{n} \sum_{i=m+1}^{m+n} O(\Sigma_i)    (15.48)

where O(Σ_i) represents the value of O computed with the set of link variables
{Σ_i}, and where the first m steps have brought the system near equilibrium.
Notice that we have replaced the original sum over all gauge configurations with
this smaller, streamlined sum of configurations {Σ_i} near equilibrium, which give
us the bulk of the nonvanishing contributions to the path integral.

There are several useful algorithms, such as the heat bath and the Metropolis
methods, which can generate this sequence of configurations {Σ_i}. We will use
the latter. If we make the change from Σ to Σ′ (by changing the value of just one
link), we can compute the corresponding change in the action:

\Delta S = S(\Sigma') - S(\Sigma)    (15.49)

Now comes the key step: choose a random number r between 0 and 1. If
e^{−ΔS} > r, then the change from Σ to Σ′ is accepted. If not, it is rejected.

If ΔS is negative, then the change is always accepted since e^{−ΔS} > 1.
However, if we only accepted negative values of ΔS, then we would always be
decreasing the action and hence would tend toward the classical equations of
motion. This, of course, throws out all quantum mechanical corrections and is to
be avoided.

By choosing this random number r, we are allowing positive values of ΔS, so
that the action can actually increase as we make the transition from Σ to Σ′. This,
in turn, allows for quantum fluctuations around the classical equations of motion.

Now make a change in another link, generating yet another configuration, and
test to see if it meets the proper criteria. In this way, we can sweep through the
entire lattice, making small changes successively in each link. Once we have
swept through the entire lattice, the process is repeated once again. After many
sweeps, we gradually reach thermal equilibrium, yielding the set of link variables
{Σ_1}. Then the process is repeated once again until, after many sweeps, we obtain
the second set of link variables {Σ_2}. Over time, we arrive at a sequence of {Σ_i},
which we then insert into the sum in Eq. (15.48).
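
The sweep procedure described above can be sketched for the simplest case. The example below is an illustration only: it assumes a Z_2 gauge theory on a small two-dimensional periodic lattice with action S = −β Σ_p σ_p, where σ_p is the product of the four links around a plaquette. The lattice size, the value of β, the data layout, and all function names are choices made here, not taken from the text.

```python
import math
import random

def plaquette(links, x, y, L):
    """Product of the four Z_2 links around the plaquette with lower-left corner (x, y)."""
    return (links[0][x][y] * links[1][(x + 1) % L][y]
            * links[0][x][(y + 1) % L] * links[1][x][y])

def metropolis_accept(delta_s, r):
    """Accept the change if exp(-delta_s) > r; a decrease in the action always passes."""
    if delta_s <= 0.0:
        return True
    return math.exp(-delta_s) > r

def delta_action(links, mu, x, y, L, beta):
    """Change in S = -beta * sum_p sigma_p when the link (mu, x, y) is flipped."""
    if mu == 0:
        touching = [(x, y), (x, (y - 1) % L)]
    else:
        touching = [(x, y), ((x - 1) % L, y)]
    # Flipping the link reverses the sign of both plaquettes that contain it.
    return 2.0 * beta * sum(plaquette(links, px, py, L) for px, py in touching)

def sweep(links, L, beta, rng):
    """Visit every link once, proposing a flip and applying the Metropolis test."""
    for mu in (0, 1):
        for x in range(L):
            for y in range(L):
                if metropolis_accept(delta_action(links, mu, x, y, L, beta), rng.random()):
                    links[mu][x][y] = -links[mu][x][y]

def average_plaquette(links, L):
    return sum(plaquette(links, x, y, L) for x in range(L) for y in range(L)) / L**2

def cold_start(L):
    """Initial configuration with every link equal to one."""
    return [[[1] * L for _ in range(L)] for _ in range(2)]

rng = random.Random(0)
L, beta = 4, 0.5
links = cold_start(L)
for _ in range(200):
    sweep(links, L, beta, rng)
print(average_plaquette(links, L))
```

In a real measurement one would discard the first m sweeps and average the observable over the following n configurations, as in Eq. (15.48); the single printed value here is only one sample.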

The net effect of this algorithm is that the new configuration Σ′ is accepted
with the conditional probability min(1, e^{−ΔS}). To see this, let P(Σ → Σ′) be the
probability of making the change from Σ to Σ′. Then this algorithm gives us:

P(\Sigma \to \Sigma') = \begin{cases} 1 & \text{if } S(\Sigma) > S(\Sigma') \\ e^{-\Delta S} & \text{if } S(\Sigma) < S(\Sigma') \end{cases}    (15.50)

This can also be written as:

\frac{P(\Sigma \to \Sigma')}{P(\Sigma' \to \Sigma)} = e^{-S(\Sigma') + S(\Sigma)}    (15.51)

But there is something that still must be checked: Is this algorithm sufficient to
force the system into thermal equilibrium?

The advantage of this iteration process is that it does, in fact, automatically tend
toward thermal equilibrium. To see this, let us review what we mean by thermal
equilibrium. The transition matrix P(Σ → Σ′) satisfies the usual properties of
stochastic matrices:

\sum_{\Sigma'} P(\Sigma \to \Sigma') = 1
P(\Sigma \to \Sigma') \ge 0    (15.52)

(The first statement simply says that probability is conserved, i.e., that the sum of
probabilities for the transition to all possible configurations is equal to 100%. The
second statement says that the probabilities are never negative.)

Next, we want to have the system in thermal equilibrium. We can consider
P(Σ → Σ′) to be a square matrix, with elements labeled by {Σ}. We demand that
the Boltzmann distribution e^{−S(Σ)} be an eigenvector of this transition matrix P:

\sum_{\{\Sigma\}} e^{-S(\Sigma)} P(\Sigma \to \Sigma') = e^{-S(\Sigma')}    (15.53)

(This simply means that, if the system is already in equilibrium, then the transition
from Σ to Σ′ leaves the system in equilibrium.)

We can also show that these three conditions are consistent with the detailed
balance equation:

\frac{P(\Sigma \to \Sigma')}{P(\Sigma' \to \Sigma)} = \frac{e^{-S(\Sigma')}}{e^{-S(\Sigma)}}    (15.54)

To prove this consistency with the detailed balance equation, we can remove the
denominators by cross multiplying and then summing over Σ. This gives us:

\sum_{\{\Sigma\}} e^{-S(\Sigma)} P(\Sigma \to \Sigma') = \sum_{\{\Sigma\}} e^{-S(\Sigma')} P(\Sigma' \to \Sigma) = e^{-S(\Sigma')}    (15.55)

where we have used Eq. (15.52) in the last step, reproducing Eq. (15.53).

This shows that the detailed balance equation is a sufficient (but not necessary) 
condition to prove thermal equilibrium. However, if we compare the detailed 
balance equation with Eq. (15.51), we find that the Metropolis algorithm satisfies 
this condition, and hence one can show that the algorithm drives the system to 
thermal equilibrium, as desired. 
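
These properties can be verified explicitly on a toy system. The sketch below assumes a system with only three configurations, with arbitrarily chosen actions, and a proposal step that picks one of the other configurations with equal probability; these are illustrative assumptions, not part of the text. It builds the Metropolis transition matrix and checks that the Boltzmann weights are left unchanged by one step.

```python
import math

def metropolis_matrix(actions):
    """Transition matrix P[i][j] for Metropolis updates with a uniform symmetric proposal."""
    n = len(actions)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                accept = min(1.0, math.exp(-(actions[j] - actions[i])))
                P[i][j] = accept / (n - 1)
        # Rejected proposals leave the configuration unchanged.
        P[i][i] = 1.0 - sum(P[i])
    return P

actions = [0.0, 0.7, 1.9]  # illustrative values of S for three configurations
P = metropolis_matrix(actions)
boltzmann = [math.exp(-s) for s in actions]

# One Metropolis step applied to the Boltzmann weights, as in the eigenvector condition.
after_step = [sum(boltzmann[i] * P[i][j] for i in range(len(actions)))
              for j in range(len(actions))]
print(boltzmann)
print(after_step)
```

The two printed lists coincide up to rounding, which is the eigenvector condition; the same matrix also satisfies detailed balance pair by pair.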

Once we have reached equilibrium and have generated a sequence of these 
configurations, we can calculate many numerical values for physical parameters. 
The simplest and most convenient is the string tension. By analyzing the behavior 
of the string tension, we can rapidly get an indication of the existence of a phase 
transition. 

If two quarks are indeed linked together by a thin, condensed glue of gauge 
fields that behaves like a string, then it should be possible to calculate the tension 
on that string with these methods. 

Let W(R, T) describe the Wilson loop, as before, and define the string tension
as:

\sigma = -\ln \Big[ \frac{W(R, T)\, W(R-1, T-1)}{W(R-1, T)\, W(R, T-1)} \Big]    (15.56)


Figure 15.4. Monte Carlo simulation of an SU(3) lattice calculation, with string tension
plotted against 1/g². As long as the string tension is nonzero, we have confinement.


Now insert the value of W(R, T) ∼ e^{−E_0(R) T} into this function. We find:

\sigma \sim \frac{dE_0(R)}{dR}    (15.57)

which very conveniently gives us the force between the quarks for all values of R.
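
The usefulness of this ratio of Wilson loops is that perimeter and constant terms cancel, leaving only the area term. The sketch below illustrates this with a model loop of the form W(R, T) = exp(−σRT − μ(R + T) − c); this functional form and the parameter values are assumptions made for the illustration, and R and T are measured in lattice units.

```python
import math

def creutz_ratio(W, R, T):
    """-ln[ W(R,T) W(R-1,T-1) / ( W(R-1,T) W(R,T-1) ) ] for a loop function W."""
    return -math.log(W(R, T) * W(R - 1, T - 1) / (W(R - 1, T) * W(R, T - 1)))

def model_loop(R, T, sigma=0.25, mu=0.4, c=0.1):
    """Hypothetical Wilson loop with area, perimeter, and constant terms in the exponent."""
    return math.exp(-sigma * R * T - mu * (R + T) - c)

print(creutz_ratio(model_loop, 3, 4))
```

The ratio returns the coefficient of the area term regardless of the values chosen for the perimeter coefficient and the constant.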

By plotting the string tension σ versus 1/(g²N), we can see in what region the
area law is satisfied or violated. In Figure 15.4, for example, we see a typical
result from a Monte Carlo calculation for SU(3). We plot the string tension σ on
the vertical axis, and β ∼ 1/g² on the horizontal axis.

Finally, we remark that since Monte Carlo methods for SU(3) are slow and
cumbersome, it is instructive to analyze simpler groups, such as Z_n, defined at
each site.

Monte Carlo studies indicate that these systems with n = 2, 3, 4 exhibit a
two-phase structure. However, for n ≥ 5, the theory seems to prefer a three-phase
structure. One phase corresponds to a confinement phase. Another corresponds
to a phase where we have spin waves and free photons. The third phase is peculiar
to systems with discrete groups only.

In the limit as n → ∞, the Z_n models approach U(1) gauge theory. In this
limit, one phase of the Z_n model shrinks to zero, leaving only two phases for large
n (and presumably QED).


15.6 Hamiltonian Formulation


Since Lorentz covariance is manifestly broken on the lattice, it is worthwhile to
investigate the canonical formulation of lattice gauge theory as a Hamiltonian
system, where the time parameter t is continuous.

As before, the continuum Yang–Mills Lagrangian can be written as:

L = p \dot q - H(p, q) = E^a_i \dot A^a_i - \frac{1}{2} \big( (E^a_i)^2 + (B^a_i)^2 \big) + A^a_0 G^a    (15.58)

where G^a is the generator of gauge transformations.

We wish to put only the three space directions on the lattice and keep the
time component continuous. In the lattice limit, we have the Kogut–Susskind
Hamiltonian:

H = \frac{g^2}{2a} \sum E^a_i E^a_i - \frac{1}{a g^2} \sum_p \mathrm{Tr}\, U_p + \text{h.c.}    (15.59)


subject to the constraint:

G^a |\Psi\rangle = 0    (15.60)

This last constraint (Gauss’s Law) forces us to choose only gauge-invariant states
for our system.

In this picture, gauge-invariant states include quark–antiquark states:

\bar\psi(n) \Big( \prod_{\text{path}} U \Big) \psi(n + R)    (15.61)

as well as glue-balls:

\mathrm{Tr} \Big( \prod_{\text{closed path}} U \Big)    (15.62)


This means that we are immediately left with a Hilbert space consisting of strings, 
without any free gluon states. The advantage of this formalism, therefore, is that, 
to lowest order, we see only strings. Any approximation we make will be an 
approximation around string states. 

The new commutation relations are given by:

[E^a(n, i), U(m, j)] = \frac{1}{2} \sigma^a U(n, i)\, \delta_{ij} \delta_{n,m}
[E^a(n, i), E^b(m, j)] = i \epsilon^{abc} E^c(n, i)\, \delta_{ij} \delta_{n,m}    (15.63)

where the lattice site is given by n, and where E^a(n, i) is the electric field.

Next, we wish to calculate the energy associated with these strings. In the
strong-coupling limit, we can keep only the E² term. For SU(2), we therefore
have:

(E^a)^2\, U |0\rangle = \frac{1}{2} \sigma^a\, \frac{1}{2} \sigma^a\, U |0\rangle = \frac{3}{4}\, U |0\rangle    (15.64)

We apply the Hamiltonian on a quark–antiquark state with length R to obtain
its energy:

H |\bar q q\rangle = \Big( \frac{g^2}{2a} \Big) \Big( \frac{3}{4} \Big) \Big( \frac{R}{a} \Big) |\bar q q\rangle    (15.65)

Thus, the energy of a string state, to lowest order, is proportional to its length,
with a string tension given by 3g²/8a². The Hamiltonian formulation of lattice
gauge theories gives us a quick way in which to see confinement and calculate the
string tension.
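
The factor 3/4 and the resulting string tension can be checked with explicit Pauli matrices. The following is a small sketch using plain 2 × 2 lists; the helper names and the sample values of g, a, and R are illustrative assumptions.

```python
# Pauli matrices as nested lists of complex numbers.
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul2(A, B):
    """Product of two 2 x 2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def casimir():
    """Sum over a of (sigma^a / 2)(sigma^a / 2), the operator appearing in the E^2 term."""
    total = [[0, 0], [0, 0]]
    for s in PAULI:
        half = [[s[i][j] / 2 for j in range(2)] for i in range(2)]
        sq = matmul2(half, half)
        total = [[total[i][j] + sq[i][j] for j in range(2)] for i in range(2)]
    return total

def string_energy(g, a, R):
    """(g^2 / 2a) * (3/4) * (R / a): lowest-order energy of a string of length R."""
    return (g**2 / (2.0 * a)) * 0.75 * (R / a)

C = casimir()
print(C)
print(string_energy(g=2.0, a=0.5, R=3.0) / 3.0)  # energy per unit length
```

The matrix is 3/4 times the identity, and the energy per unit length equals 3g²/8a² for the chosen values.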

Using operator techniques, we can calculate the string tension to any arbitrary 
accuracy. A more precise calculation of the string tension for SU (3) gives us: 


A {1 6] 
=< (5 153 y? = ms) = 0.041378 y4 = 0.034436y° ia ‘ (15.66) 


where y = 2/27. 


15.7 Renormalization Group 


We mentioned earlier that the continuum limit a → 0 is a subtle limit requiring
renormalization group techniques. This is because the lattice spacing is a regulator
on the potentially divergent structure of the theory. To eliminate divergences and
take an appropriate continuum limit, the coupling constant g must be taken to
depend on a.

For example, let O be a physical observable with dimension d. Since there
are no intrinsic dimensional constants in the theory other than the lattice spacing,
then by dimensional arguments we can write:

O = a^{-d}\, r(g)    (15.67)

where r is some function of the coupling constant g. The limit a → 0 is ambiguous.
For example, for negative d, r(g) must become singular in order to have a finite
result.

A mass m, for example, must obey:

m = \frac{1}{a}\, r(g)    (15.68)

Demanding that m be independent of a in the continuum limit leads to:

\frac{dm}{da} = 0    (15.69)

Differentiating, we arrive at:

r'(g) = -r(g) / \beta(g)
\beta(g) = -a \frac{dg}{da}    (15.70)

where β is the usual Callan–Symanzik function found in renormalization-group
theory.

We know that, for SU(N) theories, the behavior of β is given by:

\beta(g) = -a \frac{dg}{da} = -\beta_0 g^3 - \beta_1 g^5 + \cdots    (15.71)

where:

\beta_0 = \frac{11}{3} \Big( \frac{N}{16\pi^2} \Big), \qquad \beta_1 = \frac{34}{3} \Big( \frac{N}{16\pi^2} \Big)^2    (15.72)

for SU(N) theory. We can then integrate the expression for r(g):

r(g) = \exp\Big( -\int^{g} \frac{dg'}{\beta(g')} \Big)    (15.73)

which becomes:

r(g) \sim (\beta_0 g^2)^{-\beta_1 / (2\beta_0^2)} \exp\Big( -\frac{1}{2\beta_0 g^2} \Big)    (15.74)

In the limit a → 0, we take g → 0, so that r(g) goes to zero in such a way that
m is finite and nonzero. In this way, masses can develop even in a theory with
no dimensional parameters. This is an example of “dimensional transmutation,”
where massless theories develop a scale because of renormalization effects.
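
How quickly r(g) vanishes as g → 0 can be seen numerically. The sketch below assumes the two-loop form r(g) ∼ (β₀g²)^{−β₁/(2β₀²)} exp(−1/(2β₀g²)) with the SU(N) coefficients β₀ = (11/3)(N/16π²) and β₁ = (34/3)(N/16π²)², and drops any overall constant; the function names and sample couplings are illustrative.

```python
import math

def beta_coefficients(N):
    """First two coefficients of the SU(N) beta function (pure gauge theory)."""
    x = N / (16.0 * math.pi**2)
    return (11.0 / 3.0) * x, (34.0 / 3.0) * x**2

def r_of_g(g, N=3):
    """Two-loop scaling function r(g), up to an overall constant."""
    b0, b1 = beta_coefficients(N)
    return (b0 * g**2) ** (-b1 / (2.0 * b0**2)) * math.exp(-1.0 / (2.0 * b0 * g**2))

for g in (1.2, 1.0, 0.8):
    print(g, r_of_g(g))
```

Since m = r(g)/a is held fixed, the rapid decrease of r(g) with decreasing g shows how small the lattice spacing becomes at weak coupling.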

In summary, lattice gauge theory gives us perhaps the best hope of extracting
the low-energy hadron spectrum from QCD. Approximations to lattice gauge
theory, such as the strong-coupling approximation, indicate that quarks are
confined, as expected. Furthermore, simple computer calculations with Monte Carlo
programs show many of the qualitative features associated with nonperturbative
phenomena, such as phase transitions.

Some of the important problems facing lattice gauge theory include how to go 
from the Euclidean metric to the Minkowski metric, how to calculate with quarks 
on the lattice, and how to increase the computational power of our computers. 
Although lattice calculations have not yet given us the mass of the proton or other 
physical parameters of the low-lying hadron states, the qualitative features of the 
theory are all in agreement with our expectations. The only limitation seems to 
be the level of our current computer power. 


15.8 Exercises 


1. Complete all intermediate steps necessary to prove Eq. (15.7).


2. Prove that the Wilson fermion correction gives us the propagator in Eq.
(15.22).


3. Let U be an element of SU(2), parametrized as U = a_0 + iσ·a. Prove that
Σ_{i=0}^{3} a_i² = 1. Define the measure as follows:

dU = \pi^{-2}\, d^4a\, \delta(a^2 - 1)    (15.75)

Prove that the measure is invariant under multiplication by another element of
SU(2):

d(U'U) = dU    (15.76)

for fixed U′.


4. For SU(2), prove Eq. (15.38).


5. Define the quantity:

W(J) = \int dU\, \exp(\mathrm{Tr}\, JU)    (15.77)

For SU(N), prove that:

\int dU\, U_{i_1 j_1} U_{i_2 j_2} \cdots U_{i_n j_n} = \frac{\partial}{\partial J_{j_1 i_1}} \frac{\partial}{\partial J_{j_2 i_2}} \cdots \frac{\partial}{\partial J_{j_n i_n}} W(J) \Big|_{J=0}    (15.78)

and that:
det a WH=1 (15.79) 
a ~ 


[Hint: use the fact that SU(N) matrices have unit determinant. ] 
6. One can show that an explicit value for W(/) is given by: 
Se (Ney 
W(J)= Dee ; (ition Jini Sinn? Jin jv 


~ i!---(i+N — 1)! 


eS 


(15.80) 
Using this, prove that:

\int dU\, U_{i_1 j_1} U_{i_2 j_2} \cdots U_{i_N j_N} = \frac{1}{N!} \epsilon_{i_1 i_2 \cdots i_N} \epsilon_{j_1 j_2 \cdots j_N}    (15.81)

7. Evaluate: 
fw rec) (15.82) 


for the SU(N) matrix U. 


8. To construct the lattice version of the Bianchi identities, one must trace over 
two plaquettes. Construct this trace, and show how to reduce down to the 
usual Bianchi identity. 


9. Consider the Z_2 model in d dimensions, where the spins on the lattice can
only equal ±1. The partition function Z and free energy F are given by:

Z = \sum_{\sigma = \pm 1} \exp\Big( \beta \sum_p \sigma_p \Big) = \exp N F(\beta)    (15.83)

where σ_p is the product of the spins around a plaquette. For small β, the sum
over plaquettes can be performed. We find:

(\cosh\beta)^{-N d(d-1)/2}\, Z = \sum_{\text{closed surfaces}} t^{\text{area}}
  = 1 + (N/6)\, d(d-1)(d-2)\, t^6 + (N/2)\, d(d-1)(d-2)(2d-5)\, t^{10} + \cdots    (15.84)

and:

\frac{F}{d(d-1)} = \frac{1}{2} \ln\cosh\beta + \frac{d-2}{2} \Big[ \frac{1}{3} \tanh^6\beta + (2d-5) \tanh^{10}\beta + \cdots \Big]    (15.85)

for t = tanh β. The first term corresponds to summations over cubes, the
second to adjacent cubes, the next to disconnected cubes, etc. Prove these
two strong-coupling relations to second order only. Hint: use the fact that:

\exp \beta\sigma_p = \cosh\beta\, (1 + t \sigma_p)    (15.86)


10. Prove Eq. (15.64).


11. In order to have the commutation relations in Eq. (15.63), what must the
complete lattice Lagrangian look like in terms of independent variables and
their conjugates?


12. Draw the graphs necessary for the calculation of Eq. (15.66) to second order.
Do not solve.


13. The lattice gauge action makes no mention of gauge fixing, yet all integrals
are well defined, without any infinite overcounting. How does the lattice
gauge action accommodate gauge fixing?


14. Discuss how lattice gauge theory might be formulated on a noncompact group.
Discuss some of the problems.


Chapter 16 


Solitons, Monopoles, 
and Instantons 


I was observing the motion of a boat which was rapidly drawn along 
a narrow channel by a pair of horses, when the boat suddenly stopped, 
[creating] a large solitary elevation, a rounded, smooth and well-defined 
heap of water.... I followed it on horseback, and overtook it still rolling 
on at a rate of some eight or nine miles an hour, preserving its original 
figure... after a chase of one or two miles, I lost it in the windings of the
channel.
—J. Scott Russell, 1834


16.1 Solitons 


Perturbation theory is based upon making power expansions of the path integral 
around trivial vacua such as $\phi = 0$ or $\phi = \text{const}$. However, there are solutions of
the classical, nonlinear equations of motion that exhibit particle-like behavior
and give us powerful insight into the nonperturbative behavior of these theories.
A new quantum power expansion can be developed around each exact solution, 
allowing us to explore regions that are not accessible by standard perturbation 
theory. In particular, these solutions give us nonperturbative information about 
important physical phenomena such as tunneling and bound states. 

In this chapter, we will describe three different types of classical solutions that
have been intensively studied:


1. Solitons are finite-energy, localized solutions to the equations of motion that, 
after collisions, retain their shape. They were first investigated by J. Scott 
Russell¹ in 1834. Since then, a large number of different wave equations have
been shown to possess soliton solutions, especially in two dimensions. 
2. Monopoles, or particles with magnetic charge, were first investigated by Dirac.
They have been found in gauge theories with spontaneous symmetry breaking 
and may have cosmological significance. 


3. Instantons are finite-action solutions to the Euclideanized equations of motion. 
Their presence signals the possibility of tunneling between degenerate vacua. 


One of the surprising features of these solutions is that they can be studied 
using topological methods.² In topology, two geometric surfaces are considered
equivalent if they can be smoothly deformed into each other without cutting. 
For example, a coffee mug, an inner tube, and a doughnut are all topologically 
equivalent. We will find that certain topological numbers can be assigned to these 
classical solutions. 

Solitons (for solitary waves) exhibit some unusual properties, providing a 
laboratory in which we can test some of our ideas about bound states. Eventually, 
the hope is that we can extrapolate some of the qualitative features of solitons 
to describe more complex bound-state systems, such as the proton. Their main 
distinguishing feature is that, after they have scattered against each other, they 
retain their shape (although there is a phase shift after the scattering). They are 
hence stable against collisions and perturbations. 

To exhibit soliton solutions, let us begin with a two-dimensional relativistic 
Lagrangian: 


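$$\mathcal{L} = \frac{1}{2}\left(\dot\phi^{2} - \phi'^{2}\right) - U(\phi) \tag{16.1}$$

where $U(\phi)$ is some arbitrary potential function. Its classical wave equation is
given by the Euler-Lagrange equations:

$$\ddot\phi - \phi'' = -\frac{\partial U}{\partial\phi} \tag{16.2}$$

The energy is given by:

$$E = \int_{-\infty}^{\infty} dx \left(\frac{1}{2}\dot\phi^{2} + \frac{1}{2}\phi'^{2} + U(\phi)\right) \tag{16.3}$$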


We can find solutions of the equations of motion by solving them for the static 
case and then boosting them with a Lorentz transformation. For static solutions, 
we can set $\dot\phi = 0$. If we multiply the static equation of motion by $\phi'$, we have:


$$\phi''\phi' = \frac{dU(\phi)}{d\phi}\,\phi' \tag{16.4}$$

This equation can be integrated over $x$ (the integration constant vanishes
because $\phi' \to 0$ at a zero of $U$), yielding:

$$\frac{1}{2}\phi'^{2} = U(\phi) \tag{16.5}$$

Taking the square root, this equation can be integrated once again to yield:

$$x - x_0 = \pm\int_{\phi(x_0)}^{\phi(x)} \frac{d\phi}{\sqrt{2U(\phi)}} \tag{16.6}$$

Let us now give some concrete examples of soliton solutions. 


16.1.1 Example: $\phi^4$

Let us choose the potential:

$$U(\phi) = \frac{\lambda}{4}\left(\phi^{2} - m^{2}/\lambda\right)^{2} \tag{16.7}$$

The potential has two degenerate minima, given by the values:

$$\phi = \pm m/\sqrt{\lambda} \tag{16.8}$$


This means that soliton solutions, if they exist, must asymptotically tend toward 
these values as $x \to \pm\infty$; that is:

$$\phi(|x| \to \infty) = \pm m/\sqrt{\lambda} \tag{16.9}$$

To solve the system, we can integrate Eq. (16.6) for this $\phi^4$ theory to yield:

$$x - x_0 = \pm\int_{\phi(x_0)}^{\phi(x)} \frac{d\phi}{\sqrt{\lambda/2}\,\left(\phi^{2} - m^{2}/\lambda\right)} \tag{16.10}$$

Inverting, we then find:

$$\phi(x) = \pm(m/\sqrt{\lambda})\tanh\left[(m/\sqrt{2})(x - x_0)\right] \tag{16.11}$$

The $\pm$ sign in front indicates that there are two solitary waves, which are sometimes
called the "kink" and "antikink" solitons. This solution approaches the asymptotic
value $\phi = \pm m/\sqrt{\lambda}$, as it should.

The energy density is then given by:

$$\mathcal{E}(x) = (m^{4}/2\lambda)\,\mathrm{sech}^{4}\left[m(x - x_0)/\sqrt{2}\right] \tag{16.12}$$
Figure 16.1. A soliton solution in $\phi^4$ theory in two dimensions.


The mass of the soliton is given by the integral over the energy density:

$$M = \int_{-\infty}^{\infty} \mathcal{E}(x)\,dx = \frac{2\sqrt{2}\,m^{3}}{3\lambda} \tag{16.13}$$
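
As a numerical cross-check of the kink formulas, the following Python sketch integrates the static energy density $\frac{1}{2}\phi'^2 + U(\phi)$ of the kink and compares the result with the closed form $2\sqrt{2}\,m^3/(3\lambda)$. The function names, the sample values of $m$ and $\lambda$, the finite-difference step, and the integration window are illustrative assumptions and are not taken from the text.

```python
import math


def kink(x, m=1.0, lam=1.0, x0=0.0):
    """Static kink, Eq. (16.11) with the + sign."""
    return (m / math.sqrt(lam)) * math.tanh(m * (x - x0) / math.sqrt(2.0))


def potential(phi, m=1.0, lam=1.0):
    """Quartic potential U(phi) of Eq. (16.7)."""
    return 0.25 * lam * (phi**2 - m**2 / lam) ** 2


def energy_density(x, m=1.0, lam=1.0, h=1e-5):
    """(1/2) phi'^2 + U(phi), with phi' from a central difference."""
    dphi = (kink(x + h, m, lam) - kink(x - h, m, lam)) / (2.0 * h)
    return 0.5 * dphi**2 + potential(kink(x, m, lam), m, lam)


def kink_mass(m=1.0, lam=1.0, half_width=20.0, n=20000):
    """Midpoint-rule integral of the energy density over [-half_width, half_width]."""
    dx = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        total += energy_density(-half_width + (i + 0.5) * dx, m, lam)
    return total * dx


# Sample parameters (arbitrary choices for the check).
m, lam = 1.3, 0.7
numeric = kink_mass(m, lam)
closed_form = 2.0 * math.sqrt(2.0) * m**3 / (3.0 * lam)
print("numerical mass:", numeric)
print("closed form   :", closed_form)
```

The same `energy_density` helper can be used to confirm pointwise that the gradient and potential contributions are equal, which is the content of Eq. (16.5).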


Because the system is fully relativistic, we can obtain the time-dependent 
solution by simply boosting the static one. This gives us the soliton moving at 
velocity u (Fig. 16.1): 


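$$\phi(x,t) = \pm\frac{m}{\sqrt{\lambda}}\tanh\left[\frac{m}{\sqrt{2}}\,\frac{(x - x_0) - ut}{\sqrt{1 - u^{2}}}\right] \tag{16.14}$$

Then the energy of the soliton is given by:

$$E = \int_{-\infty}^{\infty} dx \left(\frac{1}{2}\dot\phi^{2} + \frac{1}{2}\phi'^{2} + U(\phi)\right) = \frac{M}{\sqrt{1 - u^{2}}} \tag{16.15}$$

where $M = 2\sqrt{2}\,m^{3}/(3\lambda)$.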

Perhaps the most interesting feature of the kink and antikink solitons is that 
they are stable. Because of the way they extend asymptotically to infinity, it takes 
an infinite amount of energy to change the kink to the constant, vacuum solution. 
Although there are no Noether currents, we suspect that this stability, in turn, 
indicates the presence of a conserved current. 

In fact, we can define a conserved current as: 


$$J^{\mu} = \frac{\sqrt{\lambda}}{2m}\,\epsilon^{\mu\nu}\partial_{\nu}\phi \tag{16.16}$$

which gives us the conserved charge:

$$Q = \int_{-\infty}^{\infty} dx\,J^{0} = \frac{\sqrt{\lambda}}{2m}\left[\phi(\infty) - \phi(-\infty)\right] \tag{16.17}$$

It is now easy to see that the constant solutions correspond to $Q = 0$ and the
kink (antikink) solutions correspond to $Q = +1\,(-1)$. Since $Q$ is constant in time,
kinks with “topological charge” Q can never decay into solutions with different 
topological charge. In fact, solutions of the equations of motion can be grouped 
according to their value of Q; that is, they fall into discrete equivalence classes, 
labeled by Q. Two solutions of the equations of motion, even though they may 
look quite different, belong to the same equivalence class if they have the same 
topological charge Q. 

The concept of the topological charge (which cannot be derived from Noether’s 
theorem) will surface repeatedly throughout our discussion of solitons and instan- 
tons. 


16.1.2 Example: Sine-Gordon Equation 


A more complicated example, the sine-Gordon equation, is given by: 
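$$\mathcal{L} = \frac{1}{2}\partial_{\mu}\phi\,\partial^{\mu}\phi + (m^{4}/\lambda)\left[\cos\left(\sqrt{\lambda}\,\phi/m\right) - 1\right] \tag{16.18}$$

Its wave equation is given by:

$$\partial_{\mu}\partial^{\mu}\phi + (m^{3}/\sqrt{\lambda})\sin\left(\sqrt{\lambda}\,\phi/m\right) = 0 \tag{16.19}$$

To eliminate some of the unwanted constants, let us make the substitution
$x \to mx$, $t \to mt$, $\phi \to (\sqrt{\lambda}/m)\phi$. Then the wave equation simply reads
$\partial_{\mu}\partial^{\mu}\phi + \sin\phi = 0$.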

Perhaps the most important way in which to catalog solutions of the sine-Gordon
equation is by their topological charge. With the potential rescaled to
$U(\phi) = 1 - \cos\phi$, the constant solutions with zero energy are given by:

$$\phi = 2\pi N \tag{16.20}$$

where $N$ is an integer. Therefore, all soliton solutions must, at $x \to \pm\infty$, tend
towards one of these constant values, labeled by an integer $N$. If the topological
current is defined as:

$$J^{\mu} = \frac{1}{2\pi}\,\epsilon^{\mu\nu}\partial_{\nu}\phi \tag{16.21}$$




then the conserved charge is given by: 


$$Q = \int_{-\infty}^{\infty} dx\,J^{0} = \frac{1}{2\pi}\left[\phi(\infty) - \phi(-\infty)\right] = N_1 - N_2 \tag{16.22}$$

where $N_1$ and $N_2$ are the integers that describe the asymptotic value of the field.
Since Q is a constant topological charge, solitons with one value of Q cannot 
decay into solutions with a differing value of Q; that is, these solutions are stable 
for topological reasons. 
Let us now calculate the value of Q for different soliton solutions. The easiest 
one is the static case, where we have: 


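$$x - x_0 = \pm\int_{\phi(x_0)}^{\phi(x)} \frac{d\phi}{2\sin(\phi/2)} \tag{16.23}$$

Inverting, and then making a Lorentz boost, we now have the solution:

$$\phi(x,t) = 4\tan^{-1}\exp\left[\pm\frac{x - x_0 - ut}{\sqrt{1 - u^{2}}}\right] \tag{16.24}$$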


where the +1 (—1) sign corresponds to the soliton (antisoliton) solution. 

By examining their asymptotic values, we can easily show that the soliton (an- 
tisoliton) solution has topological charge Q = +1(—1). Because of the periodicity 
of the cos function, we can add 27N to the soliton solution to generate a new 
soliton solution with the same value of Q. 

More complicated generalized solutions are not difficult to find. For example, 
the following solution represents the scattering of a soliton off an antisoliton: 


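$$\phi_{SA}(x,t) = 4\tan^{-1}\left(\frac{\sinh\left(ut/\sqrt{1 - u^{2}}\right)}{u\cosh\left(x/\sqrt{1 - u^{2}}\right)}\right) \tag{16.25}$$

What is most remarkable about this solution is that the individual
soliton and antisoliton waves keep their same shape even after a collision:

$$\phi_{SA} \to \phi_{S}\left(\frac{x + u(t + \Delta/2)}{\sqrt{1 - u^{2}}}\right) + \phi_{A}\left(\frac{x - u(t + \Delta/2)}{\sqrt{1 - u^{2}}}\right) \tag{16.26}$$

as $t \to -\infty$, where $\phi_{S}$ ($\phi_{A}$) corresponds to the soliton (antisoliton) solution, and
where there is a time delay given by:

$$\Delta = [(1 - u^{2})/u]\log u \tag{16.27}$$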
For $t \to +\infty$, the asymptotic solution is the same, except that $\Delta$ flips sign.
Thus, the only difference between the asymptotic states at negative infinity and
the states at positive infinity is that there has been a time delay of $\Delta$. Otherwise,
asymptotically it appears as if nothing happened. 

We should also mention that the two-soliton solution is given by: 


$$\phi = 4\tan^{-1}\left(\frac{u\sinh\left(x/\sqrt{1 - u^{2}}\right)}{\cosh\left(ut/\sqrt{1 - u^{2}}\right)}\right) \tag{16.28}$$

Since this function goes from $-2\pi$ to $2\pi$ as $x$ goes from $-\infty$ to $+\infty$, this
two-soliton solution has topological charge $Q = 2$.
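
The charges quoted above can be checked directly from the asymptotic values of the field. The sketch below, in rescaled units, evaluates $Q = [\phi(\infty) - \phi(-\infty)]/2\pi$ for the one-soliton, antisoliton, and two-soliton solutions, and also estimates the residual of $\ddot\phi - \phi'' + \sin\phi = 0$ by finite differences. The function names, the cutoff `big_x` standing in for infinity, the velocity $u = 0.5$, and the step size are assumptions made only for this illustration.

```python
import math


def soliton(x, t=0.0, u=0.5, sign=+1):
    """One-soliton (sign=+1) or antisoliton (sign=-1), Eq. (16.24) with x0 = 0."""
    gamma = 1.0 / math.sqrt(1.0 - u * u)
    return 4.0 * math.atan(math.exp(sign * gamma * (x - u * t)))


def two_soliton(x, t=0.0, u=0.5):
    """Two-soliton solution of Eq. (16.28)."""
    gamma = 1.0 / math.sqrt(1.0 - u * u)
    return 4.0 * math.atan(u * math.sinh(gamma * x) / math.cosh(gamma * u * t))


def topological_charge(field, big_x=40.0):
    """Q = [phi(+inf) - phi(-inf)] / (2 pi), with +-big_x standing in for infinity."""
    return (field(big_x) - field(-big_x)) / (2.0 * math.pi)


def sine_gordon_residual(field, x, t, h=1e-3):
    """Finite-difference estimate of phi_tt - phi_xx + sin(phi)."""
    phi_tt = (field(x, t + h) - 2.0 * field(x, t) + field(x, t - h)) / h**2
    phi_xx = (field(x + h, t) - 2.0 * field(x, t) + field(x - h, t)) / h**2
    return phi_tt - phi_xx + math.sin(field(x, t))


print("Q(soliton)     =", topological_charge(lambda x: soliton(x, sign=+1)))
print("Q(antisoliton) =", topological_charge(lambda x: soliton(x, sign=-1)))
print("Q(two-soliton) =", topological_charge(two_soliton))
print("residual of two-soliton at (0.3, 0.7):",
      sine_gordon_residual(two_soliton, 0.3, 0.7))
```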

Many-soliton solutions can also be found using an ingenious technique called
the Bäcklund transformation. Given a solution $\phi_0$ of these equations, we are able
to generate a new solution $\phi_1$.

To see this, we write the sine-Gordon equation in terms of light-cone
coordinates $\sigma = (x + t)/2$ and $\rho = (x - t)/2$. Then the sine-Gordon equation
reads:

$$\frac{\partial^{2}\phi}{\partial\sigma\,\partial\rho} = \sin\phi \tag{16.29}$$


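We now define the Bäcklund equations as:

$$\begin{aligned}
\frac{1}{2}\frac{\partial}{\partial\sigma}(\phi_1 - \phi_0) &= a\sin\left[\frac{1}{2}(\phi_1 + \phi_0)\right] \\
\frac{1}{2}\frac{\partial}{\partial\rho}(\phi_1 + \phi_0) &= \frac{1}{a}\sin\left[\frac{1}{2}(\phi_1 - \phi_0)\right]
\end{aligned} \tag{16.30}$$

Next, we differentiate the first equation with respect to $\rho$ and use the second
equation to arrive at:

$$\frac{1}{2}\frac{\partial^{2}}{\partial\sigma\,\partial\rho}(\phi_1 - \phi_0) = \cos\left[\frac{1}{2}(\phi_1 + \phi_0)\right]\sin\left[\frac{1}{2}(\phi_1 - \phi_0)\right] = \frac{1}{2}\sin\phi_1 - \frac{1}{2}\sin\phi_0 \tag{16.31}$$

Thus, $\phi_1$ is a solution of the sine-Gordon equation if $\phi_0$ is. The beauty of
this formulation is that we can now solve for $\phi_1$ in terms of $\phi_0$, thereby allowing
us to generate a new solution in terms of the old one. If, for example, we plug
in the trivial no-soliton solution $\phi_0 = 0$ into these equations, then we obtain the
one-soliton solution found earlier. The equations for $\phi_1$ read:

$$\frac{1}{2}\frac{\partial\phi_1}{\partial\sigma} = a\sin(\phi_1/2), \qquad \frac{1}{2}\frac{\partial\phi_1}{\partial\rho} = \frac{1}{a}\sin(\phi_1/2) \tag{16.32}$$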
which is easily integrated back to the one-soliton solution, with $u = (1 - a^{2})/(1 + a^{2})$.
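
To make the last step concrete, the sketch below takes the candidate solution $\phi_1 = 4\tan^{-1}\exp(a\sigma + \rho/a)$ (with the integration constant set to zero, an assumption made here for simplicity), checks by finite differences that it satisfies both Bäcklund equations with $\phi_0 = 0$, and confirms that its phase equals $(x - ut)/\sqrt{1 - u^2}$ with $u = (1 - a^2)/(1 + a^2)$. The function names and the sample value $a = 0.5$ are hypothetical, and $a > 0$ is assumed.

```python
import math


def phi1(sigma, rho, a):
    """Candidate one-soliton solution generated from the trivial solution phi0 = 0."""
    return 4.0 * math.atan(math.exp(a * sigma + rho / a))


def backlund_residuals(sigma, rho, a, h=1e-6):
    """Residuals of (1/2) d(phi1)/d(sigma) = a sin(phi1/2) and
    (1/2) d(phi1)/d(rho) = (1/a) sin(phi1/2), using central differences."""
    d_sigma = (phi1(sigma + h, rho, a) - phi1(sigma - h, rho, a)) / (2.0 * h)
    d_rho = (phi1(sigma, rho + h, a) - phi1(sigma, rho - h, a)) / (2.0 * h)
    s = math.sin(0.5 * phi1(sigma, rho, a))
    return 0.5 * d_sigma - a * s, 0.5 * d_rho - s / a


def boost_parameters(a):
    """Velocity u = (1 - a^2)/(1 + a^2) and the corresponding Lorentz factor."""
    u = (1.0 - a * a) / (1.0 + a * a)
    gamma = 1.0 / math.sqrt(1.0 - u * u)
    return u, gamma


def light_cone_phase(x, t, a):
    """a*sigma + rho/a written in terms of x and t."""
    sigma, rho = 0.5 * (x + t), 0.5 * (x - t)
    return a * sigma + rho / a


a = 0.5
u, gamma = boost_parameters(a)
print("Backlund residuals:", backlund_residuals(0.4, -0.3, a))
print("u, gamma          :", u, gamma)
print("phase comparison  :", light_cone_phase(0.3, 0.2, a), gamma * (0.3 - u * 0.2))
```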


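16.1.3 Example: Nonlinear O(3) Model

Let us start with a triplet of scalar fields $\phi^a$ with the simple O(3) action:

$$\mathcal{L} = \frac{1}{2}\partial_{\mu}\phi^{a}\,\partial^{\mu}\phi^{a} = \frac{1}{2}\partial_{\mu}\boldsymbol{\phi}\cdot\partial^{\mu}\boldsymbol{\phi} \tag{16.33}$$

This is the usual linear O(3) model, except we impose a nontrivial constraint:

$$\phi^{a}\phi^{a} = \boldsymbol{\phi}\cdot\boldsymbol{\phi} = 1 \tag{16.34}$$

We can impose this constraint by using a Lagrange multiplier $\lambda$ in the action:

$$S \to S + \int d^{2}x\,\lambda\,(\boldsymbol{\phi}\cdot\boldsymbol{\phi} - 1) \tag{16.35}$$

The energy of the system is defined as:

$$E = \frac{1}{2}\int \partial_{\mu}\boldsymbol{\phi}\cdot\partial_{\mu}\boldsymbol{\phi}\;d^{2}x \tag{16.36}$$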


(We have reversed the sign of the space derivative term in the Hamiltonian.) 

As before, let us analyze the possible soliton solutions according to their 
topological charge. We must first calculate the constant vacuum solutions, which 
then fixes the asymptotic value of the solitons. Then we construct the topological 
charge associated with each asymptotic value of the soliton. 

The zero energy vacuum solutions obey $\partial_{\mu}\phi^{a} = 0$, so they are just constants
pointing in some fixed direction in isotopic space, $\phi^{a} = \phi^{a}_{0}$. As before, the soliton
solutions at infinity must asymptotically tend to this constant isovector $\phi^{a}_{0}$.

The field $\phi^{a}(x)$, by definition, is a function that takes two-dimensional space-time,
labeled by $t$ and $x$, into a vector $\phi^{a}$ in O(3) isotopic space. In general,
this function therefore defines a map between points in $R_2$ (the two-dimensional
plane) and the space of three real coordinates $\phi^{a}$.

However, as $|x| \to \infty$, this function approaches the same constant value, $\phi^{a}_{0}$.
Therefore $x$ space is actually described by $S_2$ (a sphere) since the values of the
function at infinity are all the same, no matter where we point. In other words, we
have replaced the plane with a sphere, where infinity has been transformed into
the north pole.

Furthermore, because of the constraint $\phi^{a}\phi^{a} = 1$, the isotopic space is
actually a sphere $S_2$. In conclusion, we find that the function $\phi^{a}(x)$ is therefore a
mapping of $S_2$ ($x$, $t$ space) onto $S_2$ (isotopic $\phi^{a}$ space):

$$\phi^{a}:\; S_2 \to S_2 \tag{16.37}$$


The question is: how many distinct ways can the points of a sphere be smoothly
mapped onto the points of another sphere? There is a theorem in topology that
says that the topologically distinct classes of such mappings, denoted $\pi_2$, are
labeled by the integers:

$$\pi_2(S_2) = Z \tag{16.38}$$


(These mappings actually form a group, since we can “add” these maps by se- 
quentially iterating them.) 

To motivate this abstract mathematical result, one can study the simpler
example of classifying the number of smooth maps from the circle $S_1$ onto another
circle $S_1$. Let $\phi(\theta)$ map the circle ($0 \le \theta \le 2\pi$) onto the circle, subject to the
condition $\phi(0) = \phi(2\pi) \bmod 2\pi$. Construct the charge:

$$Q = \frac{1}{2\pi}\int_{0}^{2\pi} d\theta\,\frac{d\phi}{d\theta} = \frac{1}{2\pi}\left[\phi(2\pi) - \phi(0)\right] \tag{16.39}$$

Taking $\phi(0) = 0$, one might at first suspect that $Q$ is equal to zero, because
$\phi(2\pi) = 0$, or that $Q = 1$, because $\phi(2\pi) = 2\pi$. However, there is also the
possibility that $\phi(2\pi) = 2\pi N$, where $N$ is an integer, in which case $Q = N$. In this case, the
function $\phi(\theta)$ maps the circle ($0 \le \theta \le 2\pi$) onto another circle $N$ times; that is,
it repeatedly wraps around the circle an integer number of times. $Q$ is therefore
sometimes called the "winding number," and is a topological invariant; that is,
it does not change even if we smoothly deform the function $\phi(\theta)$, as long as
the boundary conditions remain the same. Thus, $Q$ is sensitive to the overall
topological nature of the mapping $\phi(\theta)$, not its specific value. Mathematically, we
can say:

$$\pi_1(S_1) = Z \tag{16.40}$$


Each value of $N$, in turn, represents an equivalence class of maps. Two functions
$\phi(\theta)$ and $\phi'(\theta)$ are members of the same equivalence class or "homotopy class" if
they have the same $N$.
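
The invariance of the winding number under smooth deformations can be illustrated numerically. In the sketch below a map of the circle is represented by the point $(\cos\phi, \sin\phi)$ it assigns to each $\theta$, and the winding number is accumulated from the small signed angles between successive image points. The names `winding_number` and `wrapped`, the sample count, and the periodic "wobble" deformation are assumptions introduced for the illustration.

```python
import math


def winding_number(circle_map, n=2000):
    """Sum the signed angle between successive image points and divide by 2 pi.

    circle_map(theta) must return a point (x, y) on the unit circle, and n must
    be large enough that successive image points are less than half a turn apart.
    """
    total = 0.0
    x0, y0 = circle_map(0.0)
    for i in range(1, n + 1):
        x1, y1 = circle_map(2.0 * math.pi * i / n)
        # atan2(cross, dot) is the signed angle from (x0, y0) to (x1, y1).
        total += math.atan2(x0 * y1 - y0 * x1, x0 * x1 + y0 * y1)
        x0, y0 = x1, y1
    return total / (2.0 * math.pi)


def wrapped(N, wobble=0.0):
    """phi(theta) = N*theta + wobble*sin(3*theta): wraps N times, smoothly deformed."""
    def circle_map(theta):
        phi = N * theta + wobble * math.sin(3.0 * theta)
        return math.cos(phi), math.sin(phi)
    return circle_map


for N, wobble in [(3, 0.0), (3, 0.8), (-2, 0.5), (0, 1.0)]:
    print(N, wobble, winding_number(wrapped(N, wobble)))
```

Because the deformation term is periodic in $\theta$, it leaves the boundary condition untouched, and the computed value depends only on $N$.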

Returning now to the more difficult question of the nonlinear O(3) model, we
shall find that the topological charge $Q$ can be defined as:

$$Q = \frac{1}{8\pi}\int d^{2}x\,\epsilon^{\mu\nu}\,\boldsymbol{\phi}\cdot(\partial_{\mu}\boldsymbol{\phi}\times\partial_{\nu}\boldsymbol{\phi}) \tag{16.41}$$

Our task is now to prove that $Q$ is, in fact, an integer that represents the number
of inequivalent smooth maps from $S_2$ to $S_2$. Consider a sphere of unit radius, with
the surface described by the three-dimensional Cartesian coordinates $x_a$, such that
$\sum_a x_a^{2} = 1$. We can also describe the same sphere with two-dimensional coordinates
$\sigma_1, \sigma_2$, which can be polar coordinates or any local coordinates we place on the
surface of the sphere.

Then one can show that an infinitesimal element of surface area $dS_a$ pointing
in the $a$ direction is given by:

$$dS_a = \frac{1}{2}\,\epsilon^{\mu\nu}\epsilon_{abc}\,\frac{\partial x_b}{\partial\sigma_{\mu}}\frac{\partial x_c}{\partial\sigma_{\nu}}\,d^{2}\sigma \tag{16.42}$$

By a direct calculation, one can show that this expression is independent of the
specific choice of two-dimensional coordinates $\{\sigma_{\mu}\}$ one chooses. The surface
area of a sphere can then be computed by contracting $dS_a$ onto the unit vector $x_a$
and integrating:

$$4\pi N = \int dS_a\,x_a \tag{16.43}$$

The integer $N$ appears because the map $x_a(\sigma_1, \sigma_2)$ may wind around the sphere
an integer number of times.

To make contact with the topological charge $Q$, the crucial step is to make the
replacement $x_a \to \phi_a$. Then $Q$ can be rewritten as:

$$Q = \frac{1}{8\pi}\int d^{2}x\,\epsilon^{\mu\nu}\epsilon_{abc}\,\phi^{a}\,\partial_{\mu}\phi^{b}\,\partial_{\nu}\phi^{c} = \frac{1}{4\pi}\int dS_a\,\phi^{a} = N \tag{16.44}$$

The topological charge is therefore equal to the winding number, that is, the
number of distinct ways that the points on a sphere $S_2$ can be mapped smoothly
onto another sphere $S_2$. Each $N$, in turn, represents a distinct homotopy class of
maps.

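$Q$ is also important because it appears in the self-dual solutions of the nonlinear
sigma model. For example, consider the identity:

$$\int d^{2}x\,\left[(\partial_{\mu}\boldsymbol{\phi} \pm \epsilon_{\mu\nu}\,\boldsymbol{\phi}\times\partial_{\nu}\boldsymbol{\phi})\cdot(\partial_{\mu}\boldsymbol{\phi} \pm \epsilon_{\mu\sigma}\,\boldsymbol{\phi}\times\partial_{\sigma}\boldsymbol{\phi})\right] \ge 0 \tag{16.45}$$

(We are contracting with a Euclidean metric.) This quantity is non-negative
because it is the sum of squares of real numbers.

Expanding, we find:

$$\int d^{2}x\,\left[(\partial_{\mu}\boldsymbol{\phi}\cdot\partial_{\mu}\boldsymbol{\phi}) + \epsilon_{\mu\nu}\epsilon_{\mu\sigma}(\boldsymbol{\phi}\times\partial_{\nu}\boldsymbol{\phi})\cdot(\boldsymbol{\phi}\times\partial_{\sigma}\boldsymbol{\phi})\right] \ge \pm 2\int d^{2}x\,\left(\epsilon_{\mu\nu}\,\boldsymbol{\phi}\cdot\partial_{\mu}\boldsymbol{\phi}\times\partial_{\nu}\boldsymbol{\phi}\right) \tag{16.46}$$

The two terms on the left are actually the same, since
$\epsilon_{\mu\nu}\epsilon_{\rho\sigma} = \delta_{\mu\rho}\delta_{\nu\sigma} - \delta_{\mu\sigma}\delta_{\nu\rho}$ and $\boldsymbol{\phi}\cdot\partial_{\mu}\boldsymbol{\phi} = 0$ by the constraint.
Finally, we arrive at:

$$E \ge 4\pi|Q| \tag{16.47}$$

The equality $E = 4\pi|Q|$ will be reached for the self-dual solutions:

$$\partial_{\mu}\boldsymbol{\phi} = \pm\epsilon_{\mu\nu}\,\boldsymbol{\phi}\times(\partial_{\nu}\boldsymbol{\phi}) \tag{16.48}$$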
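
The bound $E \ge 4\pi|Q|$ and its saturation can be checked on a concrete configuration. The sketch below uses a unit-vector field obtained by inverse stereographic projection of the plane onto the sphere; this test configuration is an assumption chosen for the illustration and does not appear in the text. The charge density $\frac{1}{8\pi}\epsilon^{\mu\nu}\boldsymbol{\phi}\cdot(\partial_\mu\boldsymbol{\phi}\times\partial_\nu\boldsymbol{\phi})$ and the energy density $\frac{1}{2}\partial_\mu\boldsymbol{\phi}\cdot\partial_\mu\boldsymbol{\phi}$ are integrated over a finite square, so both results carry a small truncation error. All function names, the box size, and the grid are hypothetical choices.

```python
import math


def lump(x, y):
    """Unit vector from inverse stereographic projection (assumed test field)."""
    r2 = x * x + y * y
    d = 1.0 + r2
    return (2.0 * x / d, 2.0 * y / d, (1.0 - r2) / d)


def densities(field, x, y, h=1e-4):
    """Return (charge density, energy density) at the point (x, y)."""
    p = field(x, y)
    px = [(a - b) / (2.0 * h) for a, b in zip(field(x + h, y), field(x - h, y))]
    py = [(a - b) / (2.0 * h) for a, b in zip(field(x, y + h), field(x, y - h))]
    cross = (px[1] * py[2] - px[2] * py[1],
             px[2] * py[0] - px[0] * py[2],
             px[0] * py[1] - px[1] * py[0])
    # eps^{mu nu} phi.(d_mu phi x d_nu phi) = 2 phi.(d_x phi x d_y phi)
    q = sum(a * b for a, b in zip(p, cross)) / (4.0 * math.pi)
    e = 0.5 * (sum(a * a for a in px) + sum(a * a for a in py))
    return q, e


def charge_and_energy(field, half_width=20.0, n=200):
    """Midpoint-rule integrals of both densities over a square of side 2*half_width."""
    dx = 2.0 * half_width / n
    q_total = 0.0
    e_total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * dx
        for j in range(n):
            y = -half_width + (j + 0.5) * dx
            q, e = densities(field, x, y)
            q_total += q
            e_total += e
    return q_total * dx * dx, e_total * dx * dx


Q, E = charge_and_energy(lump)
print("Q            =", Q)
print("E / (4 pi)   =", E / (4.0 * math.pi))
```

For this configuration the two printed numbers agree, which is what one expects of a field that satisfies the self-duality condition (16.48).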


16.2 Monopole Solutions 


In addition to these two-dimensional toy models, we have the more complicated 
monopole solutions of gauge theory. Before we discuss the properties of the gauge
monopole, let us review the properties of the Dirac magnetic monopole³ found in
ordinary electrodynamics. The Dirac monopole is based upon a straightforward 
generalization of the electric monopole. By analogy, the electric field E of a point 
electric charge can be generalized to the magnetic field B of a point magnetic 
monopole: 


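$$\mathbf{E} = e\,\frac{\mathbf{r}}{r^{3}} \;\to\; \mathbf{B} = g\,\frac{\mathbf{r}}{r^{3}} \tag{16.49}$$

Then Maxwell's equations are generalized to include a nontrivial divergence
of the magnetic field:

$$\nabla\cdot\mathbf{E} = 4\pi e\,\delta^{3}(\mathbf{r}) \;\to\; \nabla\cdot\mathbf{B} = 4\pi g\,\delta^{3}(\mathbf{r}) \tag{16.50}$$

If we express these fields in terms of potentials $\mathbf{E} = -\nabla\phi$ and $\mathbf{B} = \nabla\times\mathbf{A}$, then
we seem to have a contradiction. Usually, the magnetic field, because it has no
sources, can be written in terms of the curl of the vector potential. This is because
the divergence of a curl is equal to zero; that is, $\nabla\cdot\nabla\times\mathbf{A} = 0$. (This is because
$\partial_i\partial_j\epsilon^{ijk} = 0$ since $\epsilon^{ijk}$ is antisymmetric.)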

However, it is possible to evade this identity if there is a delta function type 
singularity in the A field. To see this, let us take a sphere surrounding the point 
monopole. At the top of the sphere, there is a small circle that is centered around
the north pole. The flux of magnetic field through this circle is given by:

$$\Phi = \int \mathbf{B}\cdot d\mathbf{S} = \int \nabla\times\mathbf{A}\cdot d\mathbf{S} = \oint \mathbf{A}\cdot d\mathbf{l} \tag{16.51}$$

If the circle is infinitesimally small, including only the north pole, then the line
integral of the A field around this infinitesimally small circle is zero. However, if
the circle is made successively larger, until it includes the entire sphere, then the
surface integral over the B field is given by $4\pi g$. However, the line integral over
the A field must be zero because the loop has become an infinitesimally small 
loop surrounding the south pole. To avoid this contradiction, the A field must be 
singular along the negative z axis. There must be an unphysical singularity that 
extends from the origin down to the south pole and beyond. This singularity is 
called the Dirac string. 

In addition to the Dirac string, there is yet another curious property of magnetic 
monopoles. When we apply quantum mechanics to monopoles, we find that the 
magnetic monopole charge g cannot have arbitrary values; that is, the monopole 
charge is quantized. 

To see this strange effect, notice that a wave function $\psi$ in the presence of a
monopole must be single valued when we go around the Dirac string. A plane
wave is given by:

$$\psi \sim \exp\left[(i/\hbar)(\mathbf{p}\cdot\mathbf{r} - Et)\right] \tag{16.52}$$

The wave function, in the presence of a magnetic monopole, can be obtained
by making the standard substitution $\mathbf{p} \to \mathbf{p} - (e/c)\mathbf{A}$. With this substitution, the
wave function picks up a new phase factor given by:

$$\exp\left(-\frac{ie}{\hbar c}\,\mathbf{A}\cdot\mathbf{r}\right) \tag{16.53}$$

In order for the wave function to be single valued when we go around a loop,
this factor must be equal to one. The phase accumulated around the Dirac string must
therefore be $2\pi n$, where $n$ is an integer. Then we have:

$$2\pi n = \frac{e}{\hbar c}\oint \mathbf{A}\cdot d\mathbf{l} = \frac{e}{\hbar c}\int \mathbf{B}\cdot d\mathbf{S} = \frac{e}{\hbar c}\,4\pi g \tag{16.54}$$

Therefore the final quantization condition is given by:

$$eg = \frac{n\hbar c}{2} \tag{16.55}$$


This quantization of the monopole strength is a rather curious result, which shows 
that the quantum mechanics of magnetic monopoles yields novel features. 

We would like to make a final remark about Dirac magnetic monopoles. One 
can find fault with the previous presentation because of the existence of the singular 
Dirac string. Although the Dirac string can be moved in any direction and also 
has no physical consequences, one suspects that there is another formulation of 
the monopole in which the Dirac string is absent. This new presentation of the 
magnetic monopole uses the theory of fiber bundles. It has the advantage that the 
presentation is completely nonsingular and also is formulated in a well-established 
mathematical formalism. 

Let $\mathbf{A}$ be the vector potential for the previous monopole, in which the Dirac
string goes through the south pole. However, there is, of course, another vector
potential $\mathbf{A}'$ in which the Dirac string runs through the north pole. Our strategy
is to split the sphere surrounding the magnetic monopole into two pieces along
the equator. For the northern hemisphere, we take the field configuration $\mathbf{A}$ and
simply throw away the Dirac string running through the south pole. In the southern
hemisphere we take the field configuration $\mathbf{A}'$ (and throw away the Dirac string
that runs through the north pole; see Fig. 16.2).

Thus, $\mathbf{A}$ defines the monopole field in the northern hemisphere, while $\mathbf{A}'$
describes the field in the southern hemisphere. Neither $\mathbf{A}$ nor $\mathbf{A}'$ is singular
in its own hemisphere.

However, there is a price we have to pay for this sophisticated construction; 
that is, we have to piece together these two distinct patches in order to cover 
the sphere. We will “glue” the two vector potentials along the equator. The final 
gluing process between these two different field configurations is accomplished by 
making a gauge transformation between the two configurations along the equator; 
that is: 


$$\mathbf{A}' = \mathbf{A} + \nabla\Omega \tag{16.56}$$


Since a gauge transformation cannot affect the physics, we now have a de- 
scription of the field configuration that covers the entire sphere. To see how this 
gluing is actually accomplished, let us write down the explicit representation of 
the vector fields. For $\mathbf{A}$, we have:

$$A_x = g\,\frac{-y}{r(r + z)}, \qquad A_y = g\,\frac{x}{r(r + z)}, \qquad A_z = 0 \tag{16.57}$$

Figure 16.2. A Dirac string. The total magnetic vector potential of a monopole is obtained
by splicing the field of the northern hemisphere of the diagram on the left with the field of
the southern hemisphere of the one on the right along the equator.

Actually, a more convenient description of the monopole field is given in terms
of spherical coordinates. Let $\theta$ be the polar angle, which is 0 at the north pole and
$\pi$ at the south pole. Let $\phi$ be the azimuthal angle, which ranges from 0 to $2\pi$.
Then the field configuration is given by:

$$A_r = A_\theta = 0, \qquad A_\phi = -\frac{g}{r}\,\frac{\pm 1 + \cos\theta}{\sin\theta} \tag{16.58}$$

Notice that we have two solutions, given by the sign $\pm$. The $-$ solution
corresponds to $\mathbf{A}$, while the $+$ corresponds to $\mathbf{A}'$.

We can now “glue” the two configurations together along the equator by a 
gauge transformation: 


$$A'_\phi = A_\phi - \frac{2g}{r\sin\theta} = A_\phi - (i/e)\,S\,\nabla_\phi S^{-1} \tag{16.59}$$

where:

$$S = e^{2ige\phi} \tag{16.60}$$
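
The patching construction can be checked with a few lines of arithmetic. The sketch below assumes the standard azimuthal components $A_\phi = g(1 - \cos\theta)/(r\sin\theta)$ for the northern patch and $A'_\phi = -g(1 + \cos\theta)/(r\sin\theta)$ for the southern patch, and units with $\hbar = c = 1$; these explicit forms, the function names, and the sample values are assumptions of the illustration. It compares the loop integral of $\mathbf{A}$ around a circle of latitude with the flux through the cap above it, evaluates the mismatch between the two patches, and tests whether the transition function $S = e^{2ige\phi}$ returns to itself after one turn.

```python
import cmath
import math


def a_phi_north(r, theta, g=1.0):
    """Azimuthal component regular at the north pole (string through the south pole)."""
    return g * (1.0 - math.cos(theta)) / (r * math.sin(theta))


def a_phi_south(r, theta, g=1.0):
    """Azimuthal component regular at the south pole (string through the north pole)."""
    return -g * (1.0 + math.cos(theta)) / (r * math.sin(theta))


def loop_integral(a_phi, r, theta, g=1.0):
    """Line integral of A around the circle of fixed polar angle theta."""
    return 2.0 * math.pi * r * math.sin(theta) * a_phi(r, theta, g)


def cap_flux(theta, g=1.0):
    """Flux of B = g r_hat / r^2 through the cap with polar angle between 0 and theta."""
    return 2.0 * math.pi * g * (1.0 - math.cos(theta))


def transition(phi, e, g):
    """Transition function S = exp(2 i g e phi) relating the two patches."""
    return cmath.exp(2j * e * g * phi)


theta, r, g = 1.1, 2.0, 0.5
print("loop integral :", loop_integral(a_phi_north, r, theta, g))
print("cap flux      :", cap_flux(theta, g))
print("patch mismatch:", a_phi_south(r, theta, g) - a_phi_north(r, theta, g),
      "vs", -2.0 * g / (r * math.sin(theta)))
print("S(2 pi), eg = 1/2:", transition(2.0 * math.pi, 1.0, 0.5))
print("S(2 pi), eg = 0.3:", transition(2.0 * math.pi, 1.0, 0.3))
```

In this sketch $S$ is single valued on the equator only when $2eg$ is an integer, which reproduces the quantization condition (16.55) in units with $\hbar = c = 1$.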


16.3 't Hooft-Polyakov Monopole


The previous discussion of magnetic monopoles, although interesting, was not 
compelling, because ordinary electrodynamics does not require that monopoles 
exist. Electrodynamics without monopoles is perfectly consistent. However, 
in certain gauge theories, we will find that spontaneous symmetry breaking is 
intimately connected with the existence of monopole solutions. Hence, monopoles 
must exist for these theories as a consequence of broken gauge symmetry. 

It can be shown that pure gauge theory does not, by itself, possess any static 
nonsingular monopole configurations. However, a more general case, such as 
gauge theory coupled to scalar fields, does possess monopole solutions. 

We begin with the standard gauge action with scalar fields, with the gauge 
group O(3): 


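$$\mathcal{L} = -\frac{1}{4}F^{a}_{\mu\nu}F^{a\mu\nu} + \frac{1}{2}D_{\mu}\phi^{a}D^{\mu}\phi^{a} - \frac{1}{2}m^{2}\phi^{a}\phi^{a} - \frac{\lambda}{4!}(\phi^{a}\phi^{a})^{2} \tag{16.61}$$

One can show that there exists a solution with the asymptotic behavior ($r \to \infty$):

$$A^{a}_{i} \to -\epsilon_{iab}\,\frac{r^{b}}{er^{2}}, \qquad A^{a}_{0} \to 0, \qquad \phi^{a} \to (-6m^{2}/\lambda)^{1/2}\,\frac{r^{a}}{r} \tag{16.62}$$

(We have made a nontrivial linkage between three-dimensional physical space and
three-dimensional isospin space.) One can show from this that $\phi^{a}$ is covariantly
constant at infinity (i.e., $D_{\mu}\phi^{a} = 0$).

This is the 't Hooft-Polyakov monopole.⁴,⁵ To compare this monopole, defined
for O(3) symmetry, with the usual Dirac monopole, we will have to define a new
Maxwell tensor $F_{\mu\nu}$ that will reduce to the usual one when $\phi^{a}$ becomes fixed in
isospin space. We define:

$$F_{\mu\nu} = \partial_{\mu}A_{\nu} - \partial_{\nu}A_{\mu} - \frac{1}{e|\phi|^{3}}\,\epsilon_{abc}\,\phi^{a}(\partial_{\mu}\phi^{b})(\partial_{\nu}\phi^{c}), \qquad A_{\mu} = \frac{1}{|\phi|}\,\phi^{a}A^{a}_{\mu} \tag{16.63}$$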
With this definition, we can now calculate the magnetic and electric charge of the
monopole. We find that $A_{\mu} = 0$ and that:

$$F_{0i} = 0, \qquad F_{ij} = -\frac{1}{er^{3}}\,\epsilon_{ijk}r^{k}, \qquad B_{i} = \frac{r_{i}}{er^{3}} \tag{16.64}$$

With this value of the magnetic field, then, we can show that the total flux through
a sphere surrounding the monopole is given by $4\pi/e$. But the total flux of a
monopole is $4\pi g$, so the monopole magnetic charge then obeys the constraint:

$$eg = 1 \tag{16.65}$$

which is twice the Dirac case in Eq. (16.55).

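To reveal the topological nature of these monopole solutions, we remark that
the sole contribution to $F_{\mu\nu}$ comes from the Higgs sector, since $A_{\mu} = 0$. The
magnetic current is given by $K^{\mu} = \partial_{\nu}\tilde F^{\mu\nu}$, where $\tilde F^{\mu\nu} = \frac{1}{2}\epsilon^{\mu\nu\rho\sigma}F_{\rho\sigma}$
is the dual tensor, and can be written entirely in terms of the normalized
Higgs field $\hat\phi^{a} = \phi^{a}/|\phi|$. A direct calculation shows that the conserved magnetic
current equals:

$$K^{\mu} = \frac{1}{2e}\,\epsilon^{\mu\nu\rho\sigma}\epsilon_{abc}\,\partial_{\nu}\hat\phi^{a}\,\partial_{\rho}\hat\phi^{b}\,\partial_{\sigma}\hat\phi^{c} \tag{16.66}$$

Since $\partial_{\mu}K^{\mu} = 0$, the conserved magnetic charge can be written as:

$$M = \frac{1}{4\pi}\int d^{3}x\,K^{0} = \frac{1}{8\pi e}\oint_{S_2} d^{2}S_{i}\,\epsilon_{ijk}\epsilon_{abc}\,\hat\phi^{a}\,\partial_{j}\hat\phi^{b}\,\partial_{k}\hat\phi^{c} \tag{16.67}$$

where we have integrated by parts, so that this volume integral becomes a
two-dimensional surface integral taken over $S_2$ at infinity, which is the boundary of
the static field $\phi$.

Comparing this with the definition of the winding number in Eq. (16.44), the
magnetic charge $M$ is proportional to the winding number of the map from the sphere $S_2$
(the two-dimensional sphere at spatial infinity) onto $S_2$ (in isotopic space). But we
know topologically that:

$$\pi_2(S_2) = Z \tag{16.68}$$

so we are left with $M = n/e$, where $n$ is the winding number.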
Finally, the previous results may be generalized to more complicated,
phenomenologically acceptable groups. The key element of this monopole solution
was the existence of a function $\phi$ that smoothly mapped $S_2$ onto $S_2$. If we have a
gauge group $G$ that is broken down to the subgroup $H$, then monopole solutions
will exist if there are nontrivial mappings of $S_2$ onto the coset space $G/H$; that is:

$$\pi_2(G/H) = Z \tag{16.69}$$

($G/H$ is the set of elements $g_i$ of $G$ such
that $g_1$ is equivalent to $g_2$ if $g_1 = g_2h$ for some element $h$ in $H$.) Any gauge theory
with this group property may have monopole solutions. For example, this can be
satisfied if $H$ has U(1) factors.

For example, the GUT theory based on SU(5) can be shown to have monopole
solutions because it has a nontrivial homotopy group. In addition, it can be shown
that these monopoles have finite energy and mass given, after symmetry breaking,
by roughly $137M_V$, where $M_V$ is a vector meson mass, so the monopole can
be extremely heavy. (Any gauge theory with nontrivial homotopy groups can
have monopole solutions, and hence must account for the experimental fact that
monopoles have not been conclusively seen. This, in turn, places important limits
on the production rates for monopoles in the early universe.)

Finally, we remark that it is possible to develop a complete quantum theory of 
these classical solutions, for example, a theory in which we can study the quantum 
scattering of solitons against each other, including loops. The complete quantum 
theory of solitons, however, is beyond the scope of this book. Instead, we will 
now turn to another classical solution of field theory, the instantons. 


16.4 WKB, Tunneling, and Instantons 


One of the oldest nonperturbative methods is the semiclassical or WKB approach 
used in ordinary quantum mechanics. One of the advantages of the WKB approach 
is that we can calculate tunneling effects that are beyond the usual perturbative 
method. To any finite order in perturbation theory, we will never see any of these 
nontrivial nonperturbative effects. The WKB approach also naturally leads to the 
concept of instantons,⁶,⁷ which have proved to be a powerful tool to probe the
nonperturbative regime of gauge theory. In particular, we will show that QCD
instantons force us to re-examine the whole question of CP violation.

We begin our discussion of instantons by considering $\hbar$ corrections to the
classical limit. To see the relationship between $\hbar$ and the perturbative coupling
constant $g$, let us rescale the $\phi$ field found in $\phi^4$ theory as $\phi \to \phi/g$. Under this
rescaling, the Lagrangian transforms as:

$$\mathcal{L}(\phi) \to \frac{1}{g^{2}}\,\mathcal{L}'(\phi) \tag{16.70}$$

where the mass also gets rescaled, and where $\mathcal{L}'$ is the Lagrangian in which the coupling
constant has been rescaled to unity. Classically, the coupling constant $g$ is not
important: if we can solve it classically for any value of $g$, then we can also solve
it for any other value of $g$. It can always be rescaled to one.

Figure 16.3. Quantum mechanically, a wave can tunnel across a barrier. WKB methods
give us the transmission probability.

Quantum mechanically, things are a bit different, because we also have the
quantity $\hbar$. The factor appearing in the path integral is $S/\hbar$, which can be rescaled
as:

$$\frac{S}{\hbar} \to \frac{S'}{g^{2}\hbar} \tag{16.71}$$

Thus, the weak coupling expansion is identical to an expansion in $\hbar$ in the
semiclassical approximation. The essential dimensionless parameter is $g^{2}\hbar$.

We know from ordinary quantum mechanics that it is possible for a wave
to tunnel from one side of a potential barrier to the other side (Fig. 16.3). The
transmission amplitude is given by the WKB result (for a particle of unit mass):

$$T = \exp\left(-\frac{1}{\hbar}\int_{x_1}^{x_2} dx\,\sqrt{2(V - E)}\right)\left(1 + O(\hbar)\right) \tag{16.72}$$

The important point is to note that the tunneling amplitude occurs as $\exp(-1/\hbar\cdots)$,
and hence tunneling can never be seen to any finite order in $\hbar$. By the previous
rescaling argument, this also means that tunneling can never be seen to any finite
order in perturbation theory.
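
The statement that the tunneling factor is invisible to every finite order in $\hbar$ can be made tangible with a small numerical experiment. The sketch below evaluates the WKB exponent $\int dx\,\sqrt{2(V - E)}$ for a sample double-well potential $V(x) = \frac{1}{2}(1 - x^2)^2$ at $E = 0$ (unit mass), and compares $e^{-\text{exponent}/\hbar}$ with a fixed power of $\hbar$ as $\hbar$ decreases. The potential, the turning points $\pm 1$, and all function names are assumptions made for the illustration.

```python
import math


def double_well(x):
    """Sample potential with degenerate minima at x = -1 and x = +1 (assumed form)."""
    return 0.5 * (1.0 - x * x) ** 2


def wkb_exponent(V, E, x1, x2, n=20000):
    """Midpoint-rule estimate of the integral of sqrt(2 (V - E)) from x1 to x2."""
    dx = (x2 - x1) / n
    total = 0.0
    for i in range(n):
        x = x1 + (i + 0.5) * dx
        total += math.sqrt(max(2.0 * (V(x) - E), 0.0))
    return total * dx


def transmission(V, E, x1, x2, hbar):
    """Leading WKB factor exp(-exponent / hbar), without the O(hbar) prefactor."""
    return math.exp(-wkb_exponent(V, E, x1, x2) / hbar)


exponent = wkb_exponent(double_well, 0.0, -1.0, 1.0)
print("WKB exponent:", exponent)
for hbar in (0.2, 0.1, 0.05):
    T = transmission(double_well, 0.0, -1.0, 1.0, hbar)
    # The ratio T / hbar^5 keeps shrinking: T falls faster than any power of hbar.
    print(hbar, T, T / hbar**5)
```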

The WKB method, as it was originally formulated in nonrelativistic quantum 
mechanics, consisted of solving the Schrödinger equation separately in different
regions. Then, by matching the wave function at the boundary of the potential, 
we could calculate the leakage through the potential barrier. 
To generalize these semiclassical methods and recast them in the language
of path integrals, let us first define the partition function of a system with a
Hamiltonian $H$ with a Euclidean metric as follows:

$$Z(\beta) = \mathrm{Tr}\,e^{-\beta H} \tag{16.73}$$


The fact that we have Wick rotated to a Euclidean metric, so the exponential 
appears with a real argument, is essential to our discussion. Our approach to the 
WKB method, as applied to path integrals, is to find classical solutions to the 
Euclidean equations of motion and then to integrate functionally over quantum 
fluctuations around these classical solutions. 

If we trace over the Hilbert space of eigenstates of the Hamiltonian, then the 
partition function can be written as: 


$$Z(\beta) = \sum_{n=0}^{\infty} e^{-\beta E_n} \tag{16.74}$$

We set $\beta$ to be $1/kT$, where $T$ is the temperature of the system and $k$ is Boltzmann's
constant. For our purposes, however, we will interpret $\beta$ to be the Euclidean time
$T$.

As $\beta \to \infty$, at large Euclidean times, the right-hand side vanishes, but the
contribution of the state of lowest energy $E_0$ vanishes more slowly than the rest.
To extract the lowest energy eigenvalue $E_0$, we therefore take the logarithm of both sides:

$$E_0 = -\lim_{\beta\to\infty}\frac{1}{\beta}\log Z(\beta) \tag{16.75}$$


Thus, the advantage of examining the Euclidean partition function is that we can 
analyze the ground-state energy of the system. Furthermore, if we calculate the 
imaginary part of the energy of an unstable state, we can find the decay width, 
which in turn gives us a derivation of the tunneling rate given earlier. 
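
The limit in Eq. (16.75) can be illustrated numerically. The following is a minimal sketch, not part of the original derivation: it assumes a harmonic-oscillator spectrum E_n = (n + 1/2)ω with ω = 1 (an illustrative choice), truncates the sum in Eq. (16.74) at 200 levels, and uses the hypothetical helper names `partition_function` and `ground_state_estimate`.

```python
import math

def partition_function(beta, energies):
    """Z(beta) = sum_n exp(-beta * E_n) over a (truncated) spectrum."""
    return sum(math.exp(-beta * e) for e in energies)

def ground_state_estimate(beta, energies):
    """Finite-beta estimate of E_0 = -(1/beta) log Z(beta), as in Eq. (16.75)."""
    return -math.log(partition_function(beta, energies)) / beta

# Assumed illustrative spectrum: E_n = (n + 1/2) * omega with omega = 1.
omega = 1.0
levels = [(n + 0.5) * omega for n in range(200)]

for beta in (1.0, 5.0, 20.0):
    # The estimate approaches omega / 2 as beta grows.
    print(beta, ground_state_estimate(beta, levels))
```

As β increases, the excited states are suppressed exponentially relative to the ground state, so the estimate converges to ω/2.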

Let us now write the partition function in terms of path integrals involving a 
specific potential function V(x): 


    Z(\beta) = \mathrm{Tr}\, e^{-\beta H}
             = \int_{x(0)}^{x(\beta)} Dx\, \exp\left[-\frac{1}{g^2}\int_0^{\beta} d\tau \left(\frac{1}{2}\dot{x}^2 + V(x)\right)\right]    (16.76)


where x(0) = x(β), where we have rescaled x → x/g to extract the coupling
constant in front of the action, and where β is treated like a Euclidean time.
(Notice that the potential appears with the opposite sign from the one usually
found in the Minkowski path integral.)

Figure 16.4. To analyze the tunneling between the two minima in the diagram on the left,
we must invert the potential and find classical solutions for the inverted potential on the
right connecting the two maxima. These are the instantons.


The stationary points of the path integral can be found by using the
Euler-Lagrange equations of motion:


    \ddot{x} = \frac{\partial V}{\partial x}    (16.77)


Because the sign of the potential is reversed from the usual one, we must now 
solve the equations of motion in a potential that is upside down. 

In Figure 16.4, we see a typical double-well potential V. The quantum- 
mechanical problem can therefore be solved if we know the classical solutions to 
the problem with the potential reversed, with Euclidean metric. We know from 
ordinary quantum mechanics that a state that is concentrated in one part of the well 
may tunnel into the other. To solve for the tunneling between these two states, 
we must turn this picture upside down and solve for the motion of a classical 
body with this new potential. Intuitively, this corresponds to solving the classical 
problem of a ball rolling down one hill and arriving at the other hill. 

The simplest classical solution for the system is just the static one: 


    x(\tau) = \pm a    (16.78)


where the particle just sits at the top of each potential and remains there. If we 
insert this solution back into the action S, we find that it corresponds to zero action. 

A more interesting case is when the ball rolls down one hill and then up 
the other, until it stops at the other maximum of the reversed potential. Let the 
classical solution to this simple problem be given by xo: 


    x(\tau) = x_0(\tau - \tau_0)    (16.79)


If we graph what this solution looks like classically, we find Figure 16.5(a). If we
then insert this solution into the Lagrangian, we find the graph in Figure 16.5(b),
which is a rough plot of the function:

    \frac{1}{2}\dot{x}_0^2(\tau) + V(x_0(\tau))    (16.80)

Figure 16.5. In (a), we see a plot of the solution x(τ). In (b), we see a plot of the Lagrangian
evaluated at the classical solution.


Because this solution creates an almost instantaneous blip in the Lagrangian, we 
call this finite-action, classical solution to the Euclidean problem an instanton. 
Not surprisingly, we will call the solution that takes us back from the hill to the 
original one an anti-instanton. Because we are taking the trace in the partition 
function, we are integrating over all states which start at x = —a and wind up back 
at x = —a. Thus, instantons and anti-instantons occur in pairs in the partition 
function. 

The summation in the partition function, of course, must also include
multi-instanton solutions. Because the instanton and anti-instanton create only a
momentary distortion in the Lagrangian, it is a reasonable assumption to replace the
sum over the complete multi-instanton solution with the sum over noninteracting
instanton and anti-instanton solutions appearing sequentially. Since each instanton
and anti-instanton appears only briefly, this approximation is a relatively good one
and is called the “dilute gas approximation,” after a similar approximation found
in statistical mechanics. It treats multi-instanton solutions as if the instantons
and anti-instantons were dilute (i.e., their density is low, and they act like free
noninteracting gas molecules).

We obviously neglect the overlap between instantons in this approximation,
so the contribution to the action by n instanton-anti-instanton pairs, S_{2n}, is
roughly given by the sum of the individual contributions:


    S_{2n} = 2n S_0    (16.81)


Now we would like to calculate the contribution of these instantons to the 
partition function, with the goal of calculating the ground state and decay rate for 
this quantum-mechanical problem. We will expand the functional integral around 
the classical solution for the zero-instanton and the one-instanton case as follows: 


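    x(\tau) = -a + \xi(\tau - \tau_0)

    x(\tau) = x_0(\tau - \tau_0) + \xi(\tau - \tau_0)    (16.82)

where ξ(τ) represents the quantum fluctuation around the classical solution. ξ(τ),
in turn, can be decomposed into eigenfunctions:


    \xi(\tau) = \sum_n c_n \xi_n(\tau)    (16.83)


where ξ_n are a complete set of eigenfunctions or normal modes. If we power
expand the potential and the action around the classical solution, we find:


    V(x) = V(x_0) + \xi V'(x_0) + \frac{1}{2}\xi^2 V''(x_0) + \cdots

    S = S_0 + \int d\tau \left(\frac{1}{2}\dot{\xi}^2 + \frac{1}{2}V''(x_0)\,\xi^2\right) + \cdots    (16.84)

The terms linear in ξ drop out of the action because x_0 satisfies the equation of
motion (16.77).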


The key assumption we will make is that we can ignore the higher corrections to 
the potential and the action. This approximation is quite good near the bottom of 
the well, where the potential is approximated by a quadratic function, but is less 
reliable away from the minimum. 

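Let Z_n represent the contribution to the path integral of a configuration containing
a total of n instantons and anti-instantons (so that Z_2 contains one
instanton-anti-instanton pair). After we make this approximation, we find:


    Z_0 = \int D\xi\, \exp\left[-\int_0^{\beta}\left(\frac{1}{2}\dot{\xi}^2 + \frac{1}{2}\omega^2\xi^2\right) d\tau\right]
        = \left[\det\left(-\partial_\tau^2 + \omega^2\right)\right]^{-1/2}    (16.85)

for the zero-instanton contribution, where ω² is V'' evaluated at the minimum of the
well, and:


    Z_1 = e^{-S_0}\int D\xi\, \exp\left[-\int_0^{\beta}\left(\frac{1}{2}\dot{\xi}^2 + \frac{1}{2}V''(x_0)\,\xi^2\right) d\tau\right]
        = e^{-S_0}\left[\det\left(-\partial_\tau^2 + V''(x_0)\right)\right]^{-1/2}    (16.86)


for the one-instanton contribution. All determinants are evaluated
with respect to the eigenfunctions ξ_n.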

In the dilute gas approximation, the complete partition function is given by 
the sum over all the multi-instanton contributions, so: 


    Z(\beta) = Z_0 + Z_2 + Z_4 + \cdots    (16.87)


Our task is now to find an expression for Z_{2n} in terms of Z_0 and Z_1.

In this approximation, the higher Z_{2n} can all be reduced because the functional
integral factorizes. The two-instanton contribution, for example, consists of a
functional integral over two regions I and II, as in Figure 16.6. The functional
integral factorizes as the product of ∏_τ dx(τ) e^{−S}, where τ ranges over regions I
and II:


    Z_2 = Z_1(\mathrm{I})\, Z_1(\mathrm{II})    (16.88)


where Z_n(R) is the functional integral over the n-instanton configuration evaluated
only in region R:


    Z_n(R) = \int\cdots\int \prod_{\tau\in R} dx(\tau)\; e^{-\int_R d\tau\, L(\tau)}    (16.89)


Figure 16.6. The instanton-anti-instanton contribution is shown in this diagram, which
plots x(τ) and consists of an integral over regions I and II.


Likewise, Z_1 and Z_0 can be factorized as functional integrals over regions I
and II:


    Z_0 = Z_0(\mathrm{I})\, Z_0(\mathrm{II}); \qquad Z_1 = Z_1(\mathrm{I})\, Z_0(\mathrm{II})    (16.90)

By multiplying and dividing by Z_0, we now easily have:


    Z_2 = \frac{1}{2}\,\frac{Z_1^2}{Z_0}    (16.91)

(The factor of 1/2 comes from the restriction that the position of the instanton is taken
to be larger than the position of the anti-instanton. If we remove this restriction,
then we must compensate by dividing by 2.)

By continuing to factorize the functional integral into the product of Z_0(R)
and Z_1(R) over different regions, we can then show that:


    Z_{2n} = \frac{1}{(2n)!}\, T^{2n} Z_0    (16.92)


where T = Z_1/Z_0. All Z_{2n} are now expressed in terms of Z_0 and Z_1. If we sum
over all the multi-instanton contributions, we find:


    Z(\beta) = \sum_{n=0}^{\infty} Z_{2n} = Z_0 \sum_{n=0}^{\infty} \frac{1}{(2n)!}\, T^{2n}
             = Z_0 \cosh T
             \to \frac{1}{2}\, Z_0\, e^{T}    (16.93)


where we took the limit as β → ∞ in the last line.
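
The resummation of the dilute gas series can be checked numerically. The sketch below is illustrative only: `dilute_gas_sum` is a hypothetical helper that evaluates a truncated version of the series for Z/Z_0, and the values of T are arbitrary test inputs rather than quantities computed from any particular potential.

```python
import math

def dilute_gas_sum(t, n_max=60):
    """Partial sum of sum_n t^(2n) / (2n)!, the series for Z / Z_0."""
    return sum(t ** (2 * n) / math.factorial(2 * n) for n in range(n_max + 1))

for t in (0.5, 3.0, 10.0):
    series = dilute_gas_sum(t)
    # Compare with the closed form cosh(T) and with the large-T limit exp(T) / 2.
    print(t, series, math.cosh(t), 0.5 * math.exp(t))
```

For large T, which is the relevant regime since T grows linearly with β, the series is dominated by the single exponential e^T/2.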
In the limit of large β, the contribution of the vacuum to the partition function
gives us the standard harmonic oscillator result:


    Z_0 \to e^{-\beta\hbar\omega/2} \qquad (\beta \to \infty)    (16.94)

In the presence of instantons, however, we expect to find a quantum correction
to this:


    E_0 \to \frac{\hbar\omega}{2} + \epsilon    (16.95)

where:

    \epsilon = -\lim_{\beta\to\infty} \frac{T}{\beta}    (16.96)
and: 
The final answer is therefore:

    E_0 = \frac{\hbar\omega}{2} + \hbar K e^{-S_0/\hbar}    (16.98)

where K is the ratio of the two determinants.
Taking the imaginary part, we find that the decay width is:

    \Gamma = \hbar\, |K|\, e^{-S_0/\hbar}    (16.99)


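which is the original WKB result presented earlier. To see this, we note that the
classical solution x_cl obeys ½(ẋ_cl)² = V(x_cl), so that:


    S_0 = \int_{-\infty}^{\infty} d\tau \left(\frac{1}{2}\dot{x}_{cl}^2 + V\right) = \int_{-\infty}^{\infty} (\dot{x}_{cl})^2\, d\tau
        = \int_{x_1}^{x_2} \dot{x}_{cl}\, dx = \int_{x_1}^{x_2} \sqrt{2V(x)}\, dx    (16.100)


so that the tunneling amplitude is proportional to:

    \exp\left(-\frac{1}{\hbar}\int_{x_1}^{x_2} \sqrt{2V}\, dx\right)    (16.101)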


as in the WKB result quoted earlier in Eq. (16.72). 
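
The equality of the two forms of the instanton action in Eq. (16.100) can be checked numerically. This is a sketch under stated assumptions: the double-well potential V(x) = ½(x² − a²)², the value a = 1.5, and the helper names are illustrative choices not taken from the text. For this potential the instanton is x_cl(τ) = a tanh(aτ) and the action is 4a³/3.

```python
import math

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3

a = 1.5  # assumed location of the minima, x = -a and x = +a

def V(x):
    """Assumed double-well potential with minima at x = -a and x = +a."""
    return 0.5 * (x * x - a * a) ** 2

def x_cl(tau):
    """Instanton for this potential, centered at tau_0 = 0."""
    return a * math.tanh(a * tau)

def xdot_cl(tau):
    """Euclidean-time derivative of x_cl."""
    return a * a / math.cosh(a * tau) ** 2

def euclidean_lagrangian(tau):
    """(1/2) xdot^2 + V evaluated on the instanton; a localized blip in tau."""
    return 0.5 * xdot_cl(tau) ** 2 + V(x_cl(tau))

S_from_tau = simpson(euclidean_lagrangian, -20.0, 20.0, n=4000)
S_from_x = simpson(lambda x: math.sqrt(2.0 * V(x)), -a, a)
S_exact = 4.0 * a ** 3 / 3.0
print(S_from_tau, S_from_x, S_exact)
```

The integrand in τ is concentrated near τ = 0, which is the "almost instantaneous blip" that gives the instanton its name.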

[We note that we omitted some subtle details concerning the determination of 
K in our final expression. In particular, a naive calculation of K actually vanishes 
because of zero modes, since the determinant in Eq. (16.97) is the product of 
the eigenvalues, which can be zero. The zero mode is due to the time translation 
invariance of the system, and hence we must be careful in integrating over all
positions of the instanton. For a careful determination of K and how to handle
zero modes, the reader is referred to the literature on instantons.]


16.5 Yang-Mills Instantons 


The purpose of discussing instantons is to probe the nonperturbative realm of gauge 
theories. We will see that the theory of finite-action solutions to the Euclidean 
Yang-Mills theory has profound implications for the nature of QCD. In particular, 
we will be interested in considering the implications of classical solutions to the 
Euclidean Yang-Mills equations of motion, which are self-dual. 

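If we define:


    \tilde{F}_{\mu\nu} = \frac{1}{2}\,\epsilon_{\mu\nu\alpha\beta}\, F_{\alpha\beta}    (16.102)


then a classical solution is self-dual if:


    F_{\mu\nu} = \tilde{F}_{\mu\nu}    (16.103)

Our first task is to calculate the action corresponding to a self-dual solution
to the Euclidean Yang-Mills theory. We begin with the simple observation that
the sum of squares of a sequence of numbers must necessarily be greater than or
equal to zero:

    \mathrm{Tr}\left(F_{\mu\nu} - \tilde{F}_{\mu\nu}\right)^2 \geq 0    (16.104)

Let us now expand the terms in the sum. We use the identity:

    \epsilon_{\mu\nu\rho\sigma}\,\epsilon_{\mu\nu\alpha\beta} = 2\left(\delta_{\rho\alpha}\delta_{\sigma\beta} - \delta_{\rho\beta}\delta_{\sigma\alpha}\right)    (16.105)

From this, we can show:

    \mathrm{Tr}\,\tilde{F}_{\mu\nu}\tilde{F}_{\mu\nu} = \mathrm{Tr}\,F_{\mu\nu}F_{\mu\nu}    (16.106)

Therefore, our inequality now reads:


    \mathrm{Tr}\,F_{\mu\nu}F_{\mu\nu} \geq \mathrm{Tr}\,F_{\mu\nu}\tilde{F}_{\mu\nu}    (16.107)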
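
The intermediate algebra between Eqs. (16.104) and (16.107) is short enough to spell out. The following worked steps are a sketch that uses only the definition of the dual field strength, the epsilon identity (16.105), and the antisymmetry of F:

```latex
\begin{align*}
\tilde{F}_{\mu\nu}\tilde{F}_{\mu\nu}
  &= \tfrac{1}{4}\,\epsilon_{\mu\nu\rho\sigma}\,\epsilon_{\mu\nu\alpha\beta}\,
     F_{\rho\sigma}F_{\alpha\beta}
   = \tfrac{1}{2}\left(F_{\alpha\beta}F_{\alpha\beta} - F_{\beta\alpha}F_{\alpha\beta}\right)
   = F_{\alpha\beta}F_{\alpha\beta}, \\[4pt]
\mathrm{Tr}\left(F_{\mu\nu}-\tilde{F}_{\mu\nu}\right)^2
  &= \mathrm{Tr}\,F_{\mu\nu}F_{\mu\nu}
     - 2\,\mathrm{Tr}\,F_{\mu\nu}\tilde{F}_{\mu\nu}
     + \mathrm{Tr}\,\tilde{F}_{\mu\nu}\tilde{F}_{\mu\nu} \\
  &= 2\left(\mathrm{Tr}\,F_{\mu\nu}F_{\mu\nu}
     - \mathrm{Tr}\,F_{\mu\nu}\tilde{F}_{\mu\nu}\right) \;\geq\; 0 .
\end{align*}
```

The cross term uses the cyclic property of the trace, and the bound is saturated exactly when the field is self-dual.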


Our original task, to calculate the action corresponding to a self-dual solution,
is now reduced to calculating the integral of F F̃. This is easily accomplished by
observing that F F̃ is actually a total derivative:


    \frac{1}{4}\,\mathrm{Tr}\,F_{\mu\nu}\tilde{F}_{\mu\nu} = \partial_\mu K_\mu    (16.108)


where a direct computation shows that:


    K_\mu = \epsilon_{\mu\nu\alpha\beta}\,\mathrm{Tr}\left(\frac{1}{2}A_\nu\partial_\alpha A_\beta - \frac{ig}{3}A_\nu A_\alpha A_\beta\right)    (16.109)


Because of this, we can integrate over the volume of four-space:

    \int \mathrm{Tr}\,F_{\mu\nu}\tilde{F}_{\mu\nu}\, d^4x = 4\int \partial_\mu K_\mu\, d^4x    (16.110)
Normally, in field theory we expect that the integral of a total derivative should
vanish. However, the field may vanish slowly enough at infinity so that we can
have nonzero values of this integral. In fact, as we shall demonstrate shortly, this
integral equals an integer:


    n = \frac{1}{32\pi^2}\int d^4x\, F^a_{\mu\nu}\tilde{F}^a_{\mu\nu}    (16.111)


Putting everything together, we now have:


    S = \frac{1}{4g^2}\int d^4x\, \left(F^a_{\mu\nu}\right)^2
      \geq \frac{1}{4g^2}\int d^4x\, F^a_{\mu\nu}\tilde{F}^a_{\mu\nu}
      = \frac{8\pi^2 n}{g^2}    (16.112)

and therefore:

    S \geq S_{\text{self-dual}} = \frac{8\pi^2 n}{g^2}    (16.113)


As desired, we have now shown that a Euclidean, self-dual solution, if it exists, 
has finite action, labeled by n, which will be called the winding number. Inserting 
this value of the action back into the path integral, the contribution of the self-dual 
solution to the functional integral is given by: 


    e^{-S} = e^{-8\pi^2 n/g^2}    (16.114)

This clearly shows the nonperturbative nature of the instanton. The contribution
of the instanton is proportional to exp(−1/g²), which can never be approximated
to any finite order in perturbation theory. Thus, the instanton contributes
nonperturbatively to the gauge theory functional.
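
A quick numerical illustration of why a factor of the form exp(−1/g²) is invisible in perturbation theory: it vanishes faster than any power of g as g → 0, so every coefficient of its Taylor series about g = 0 is zero. The sketch below assumes the one-instanton weight exp(−8π²/g²) implied by the action bound above; the helper names and sample couplings are illustrative.

```python
import math

def instanton_weight(g, n=1):
    """exp(-8 pi^2 n / g^2), the assumed weight of an n-instanton configuration."""
    return math.exp(-8.0 * math.pi ** 2 * n / g ** 2)

def ratio_to_power(g, order):
    """Instanton weight divided by g**order; tends to 0 as g -> 0 for any order."""
    return instanton_weight(g) / g ** order

for g in (2.0, 1.0, 0.5):
    # Even after dividing by a high power of g, the ratio shrinks as g decreases.
    print(g, [ratio_to_power(g, k) for k in (2, 10, 20)])
```

No finite power g^k can therefore reproduce the instanton contribution at weak coupling.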

We still, however, have not touched upon a few important questions: First, do 
these self-dual solutions really exist, and, if so, what do they look like, and what 
possible physical consequence can Euclidean solutions have upon our Minkowski 
world? 

To answer these questions, we start with the one-instanton solution for SU(2)
Euclidean Yang-Mills theory. From our previous discussion of instantons, we are
led to postulate a form for the gauge field that asymptotically goes to the vacuum
solution A_μ → (−i/g)(∂_μΩ)Ω⁻¹. We are led to postulate the form (ref. 8):


    A_\mu = \left(-\frac{i}{g}\right)\frac{x^2}{x^2+\lambda^2}\,(\partial_\mu\Omega)\,\Omega^{-1}    (16.115)


where λ is an arbitrary parameter and where:


    \Omega = \frac{x_4 + i\sigma_i x_i}{\sqrt{x^2}}    (16.116)

Since Ω†Ω = 1, we have:

    x^2 = x_4^2 + x_i^2    (16.117)
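
The claim that the matrix Ω = (x_4 + iσ_i x_i)/√(x²) of Eq. (16.116) is an element of SU(2) for every nonzero point x can be verified directly. The following sketch uses plain nested lists for 2×2 complex matrices; the helper names and the sample point are illustrative assumptions.

```python
import math

def omega(x1, x2, x3, x4):
    """2x2 matrix (x4 + i sigma_j x_j) / |x| built from the Pauli matrices."""
    r = math.sqrt(x1 * x1 + x2 * x2 + x3 * x3 + x4 * x4)
    return [
        [complex(x4, x3) / r, complex(x2, x1) / r],
        [complex(-x2, x1) / r, complex(x4, -x3) / r],
    ]

def dagger(m):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[m[j][i].conjugate() for j in range(2)] for i in range(2)]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

om = omega(0.3, -1.2, 0.7, 2.0)  # arbitrary sample point in Euclidean four-space
product = matmul(dagger(om), om)
print(product)   # close to the identity matrix
print(det(om))   # close to 1
```

Because Ω depends only on the direction x_μ/|x|, each point of the unit three-sphere gives one SU(2) element, which is the correspondence used later to define the winding number.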


We can also generalize this single-instanton solution to the more general case.
Let us define:


    A_{ia} = \left(\epsilon_{aik}\partial_k \mp \delta_{ai}\partial_4\right)\log f
    A_{4a} = \pm\,\partial_a \log f    (16.118)


where i, k = 1, 2, 3 and we have deliberately mixed up space and isospin indices.
The condition F_μν = ±F̃_μν fixes a constraint on f:


    \frac{1}{f}\,\partial_\mu\partial_\mu f = 0    (16.119)


If we choose f = 1 + λ²/x², then we recover the previous solution with winding
number n = ±1. However, we can also choose:


    f(x) = 1 + \sum_{i=1}^{n} \frac{\lambda_i^2}{(x - x_i)^2}    (16.120)

which corresponds to a multi-instanton solution, where the various instantons are
located at x_i. This solution is parametrized by an arbitrary integer n.

Next, let us explore the asymptotic properties of this instanton. If we take the
limit x_4 → ±∞, then the self-dual instanton solution reduces to:


    A_\mu \to \left(-\frac{i}{g}\right)(\partial_\mu\Omega)\,\Omega^{-1}    (16.121)


that is, it approaches the vacuum solution at infinity, as desired. In particular, we
find that:


    x_4 \to +\infty: \quad A_i \to i\,(\Omega_n)^{-1}(\partial_i\Omega_n)

    x_4 \to -\infty: \quad A_i \to i\,(\Omega_{n-1})^{-1}(\partial_i\Omega_{n-1})    (16.122)

where:

    \Omega_n = (\Omega_1)^n, \qquad \Omega_1 = \exp\left(-i\pi\,\frac{x_j\sigma_j}{\sqrt{x^2+\lambda^2}}\right)    (16.123)


This is a rather surprising conclusion. It shows that the n-instanton solution,
at x_4 = ±∞, connects two different vacua, whose winding numbers differ by one
unit. One vacuum has winding number n − 1, and the other has winding number n.
(This is similar to the instanton solution we found in Eq. (16.79), which connects
the two vacua at x = −a and x = a.)

We now can give a mathematical meaning to the index n in Eq. (16.111). Let
us specialize our case to the group SU(2). The elements Ω of SU(2), in turn, can
be put in correspondence with the points that label a three-dimensional sphere, as
in Eq. (16.117). Thus, for each point x_μ on a three-dimensional sphere S_3, we
can generate an element Ω of SU(2).

Since, at asymptotic times, the gauge field becomes a pure gauge field:


    A_\mu \to \left(-\frac{i}{g}\right)(\partial_\mu\Omega)\,\Omega^{-1}    (16.124)


then a pure gauge configuration is labeled by a three-dimensional surface, given
by a hypersphere S_3.
Let us now insert this value of A_μ into the expression for K_μ, which gives, up to
a constant normalization factor:


    K_\mu \propto \epsilon_{\mu\nu\alpha\beta}\,\mathrm{Tr}\left[(\Omega^{-1}\partial_\nu\Omega)(\Omega^{-1}\partial_\alpha\Omega)(\Omega^{-1}\partial_\beta\Omega)\right]    (16.125)
To make some sense out of this expression, we will parametrize the invariant
SU(2) group measure dU (which we introduced in Chapters 9 and 15) as follows:


    dU = \rho(\alpha_1,\alpha_2,\alpha_3)\, d\alpha_1\, d\alpha_2\, d\alpha_3    (16.126)

where U is an element of SU(2) parametrized by some coordinates α_i. Let U_0 be a
fixed element of SU(2), and U′ = U_0 U. If {α} are the coordinates that parametrize
U and {α′} the coordinates that parametrize U′, then the group measure obeys:


    dU' = \rho(\alpha_1',\alpha_2',\alpha_3')\, d\alpha_1'\, d\alpha_2'\, d\alpha_3' = dU    (16.127)


that is, the group measure obeys dU = d(U_0 U) for fixed U_0.
Then there is a theorem from classical group theory that states that the invariant
measure ρ(α_1, α_2, α_3) is given by:


    \rho(\alpha_1,\alpha_2,\alpha_3) = \epsilon^{ijk}\,\mathrm{Tr}\left(U^{-1}\partial_i U\; U^{-1}\partial_j U\; U^{-1}\partial_k U\right)    (16.128)

With this expression, one can check explicitly that:

    \rho(\alpha) = \rho(\alpha')\,\mathrm{Det}\left(\frac{\partial\alpha'}{\partial\alpha}\right)    (16.129)

Then the index n is given, with constant normalization factors omitted, by:

    n \propto \int \partial_\mu K_\mu\, d^4x
      \propto \oint d\sigma_\mu\, \epsilon_{\mu\nu\alpha\beta}\,\mathrm{Tr}\left[(\Omega^{-1}\partial_\nu\Omega)(\Omega^{-1}\partial_\alpha\Omega)(\Omega^{-1}\partial_\beta\Omega)\right]
      \propto \int dU    (16.130)


In the last line, we have the integral over the invariant volume element in the
group space. The surface term in Euclidean space E_4 is taken as r → ∞, where
r = (x_1² + x_2² + x_3² + x_4²)^{1/2}. This boundary, of course, is the hypersphere S_3.
Thus, the index n gives us the degree of mapping from:

    S_3 \to S_3    (16.131)


that is, it gives us the number of topologically distinct ways in which the surface
of S_3 can wind around another S_3.

This formalism thus gives us a nontrivial mapping from one S_3 onto another;
one S_3 represents the isotopic space of SU(2), denoted by Ω, and the other S_3
represents physical space, the boundary of Euclideanized space, denoted by the
boundary of the integral over x-space.
The classes of mappings S_3 → S_3 form the homotopy group π_3(S_3). In topology,
one can show:

    \pi_3(S_3) = Z    (16.132)


Thus, the mappings S_3 → S_3 are characterized by integers; that is, the points
of one S_3 can be mapped smoothly to another S_3 such that we wind around S_3 an
integer number of times. This now explains the mathematical origin of the index
n.

This clearly demonstrates the highly nontrivial nature of the gauge instantons. 
It reveals the fact that the naive vacuum of Yang—Mills theory is the incorrect 
one, that there are actually an infinite number of topologically distinct vacua, each 
labeled by an integer n. 

This shows that the vacuum of Yang-Mills theory actually consists of an 
infinite number of degenerate vacua, so the true vacuum must be a superposition 
of all of them. 


16.6 θ Vacua and the Strong CP Problem


Finally, we comment on the physical interpretation of the theory of instantons (refs. 8, 9).
In ordinary quantum mechanics, we know that nonperturbative effects, such as
tunneling, can be computed using the WKB method. This formalism, in turn,
requires finding solutions to the Euclidean equations of motion that connect two
classical solutions at x_4 → ±∞. We now see the true significance of instanton
solutions: They allow tunneling between different vacua because they connect
these vacua at x_4 → ±∞.

The naive vacuum is thus unstable. The instanton allows tunneling between
all possible vacua labeled by winding number n. Thus, the true vacuum must be a
superposition of the various vacua |n⟩, each belonging to some different homotopy
class.

The effect of a gauge transformation Ω_1 in Eq. (16.122) is to shift the winding
number n by one:


    \Omega_1: \quad |n\rangle \to |n+1\rangle    (16.133)


Since the effect of Ω_1 on the true vacuum can change it only by an overall phase
factor, this fixes the coefficients of the various vacua |n⟩ within the true vacuum
as follows:


    |\mathrm{vac}\rangle_\theta = \sum_{n=-\infty}^{\infty} e^{-in\theta}\,|n\rangle    (16.134)

We can then check that the effect of Ω_1 on the true vacuum is to generate a
phase shift:


    \Omega_1: \quad |\mathrm{vac}\rangle_\theta \to e^{i\theta}\,|\mathrm{vac}\rangle_\theta    (16.135)
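
The phase acquired by the θ vacuum under the shift |n⟩ → |n+1⟩ can be checked on a truncated superposition. This is a minimal sketch that assumes the coefficients c_n = e^{−inθ} for the state |n⟩; the truncation to |n| ≤ 5, the value of θ, and the helper names are illustrative choices.

```python
import cmath

def theta_vacuum(theta, n_max):
    """Coefficients c_n = exp(-i n theta) of |n> for -n_max <= n <= n_max (truncated)."""
    return {n: cmath.exp(-1j * n * theta) for n in range(-n_max, n_max + 1)}

def shift_winding(state):
    """Shift |n> -> |n+1>: the coefficient of |n+1> becomes the old c_n."""
    return {n + 1: c for n, c in state.items()}

theta = 0.7
state = theta_vacuum(theta, 5)
shifted = shift_winding(state)
phase = cmath.exp(1j * theta)

# Away from the truncation edges, the shifted state equals phase * state.
overlap = [n for n in shifted if n in state]
max_dev = max(abs(shifted[n] - phase * state[n]) for n in overlap)
print(max_dev)
```

In the untruncated sum the edge terms disappear, so the shift acts on the full superposition as multiplication by the single phase e^{iθ}.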


The presence of instantons means that the true vacuum is parametrized by
the arbitrary number θ. The effect of this θ dependence can also be expressed
by writing down an effective action. To do this, recall that in Chapter 8 we
wrote the expectation value ⟨x_2|e^{−iHt}|x_1⟩ as a Lagrangian path integral, where
the integration over Dx connected the configurations x_1 and x_2. Likewise, we
may write the expectation value ⟨m|e^{−iHt}|n⟩ as the path integral over DA_μ that
connects the mth vacuum with the nth vacuum:


    \langle m|e^{-iHt}|n\rangle = \int [DA_\mu]_{\nu}\, \exp\left(-i\int \mathcal{L}\, d^4x\right)    (16.136)


where we integrate over all A_μ of the same homotopic class with winding number
ν = m − n.
This allows us to write the vacuum-to-vacuum transition as:

    {}_{\theta'}\langle \mathrm{vac}|e^{-iHt}|\mathrm{vac}\rangle_{\theta}
        = \delta(\theta - \theta') \sum_{\nu} e^{-i\nu\theta} \int [DA_\mu]_{\nu}
          \times \exp\left(-i\int \mathcal{L}\, d^4x\right)    (16.137)


where we have obtained the delta function by summing over m, and where [DA_μ]_ν
connects two vacua whose winding numbers differ by ν. The phase factor e^{−iνθ} can
be absorbed into the action. We know that ν = (1/16π²) ∫ d⁴x Tr F F̃, so we can
add it to the Lagrangian, giving us an effective Lagrangian:


    \mathcal{L}_{\mathrm{eff}} = \mathcal{L} + \Delta\mathcal{L} = \mathcal{L} + \frac{\theta}{16\pi^2}\,\mathrm{Tr}\,F_{\mu\nu}\tilde{F}_{\mu\nu}    (16.138)

This is a rather surprising result: the effect of the instantons is to create
tunneling between degenerate vacua, which in turn generates an effective action
with the additional term F F̃. The presence of this extra term in the action does not
alter the theory perturbatively, since it is a total derivative and hence never enters
into the perturbation theory. Perturbatively, we therefore never see the effect of
this term. However, nonperturbatively, it will have an important effect on the
physics.

So far, our discussion has been rather abstract. We will now show that the
instanton solution has an immediate impact on QCD. The instanton solves one
problem [the U(1) problem] but also raises another (the strong CP problem).

To understand the U(1) problem, let us first catalog all the global symmetries
of QCD. In the limit of zero quark masses, QCD for the up and down quarks is
invariant under chiral SU(2) ⊗ SU(2). This is because:


    \bar{q}\,\gamma^\mu D_\mu\, q = \bar{q}_L\,\gamma^\mu D_\mu\, q_L + \bar{q}_R\,\gamma^\mu D_\mu\, q_R    (16.139)


so the left- and right-handed sectors are separately invariant under SU(2)_L and
SU(2)_R.

QCD is also invariant under two global U(1) transformations. The first U(1)
transformation leads to a conserved current:


    J_\mu = \sum_a \bar{\psi}_a \gamma_\mu \psi_a    (16.140)


which gives us baryon number conservation, which is, of course, seen experimentally.
However, the second U(1) symmetry is given by the transformation:


    \psi_a \to e^{i\alpha\gamma_5}\,\psi_a    (16.141)

This leads to the current:


    J_\mu^5 = \sum_a \bar{\psi}_a \gamma_\mu \gamma_5 \psi_a    (16.142)


Although QCD is classically invariant under global SU(2) ⊗ SU(2) ⊗ U(1) ⊗
U(1), quantum corrections to QCD may alter this symmetry in various ways.
There are three possibilities: 


1. A symmetry may be preserved by quantum corrections, in which case the 
particle spectrum should manifest this symmetry. 


2. The symmetry could be spontaneously broken, in which case there are 
Nambu-Goldstone bosons. 


3. The symmetry may be broken by quantum corrections, in which case the sym- 
metry is not manifested in the particle spectrum and the Nambu—Goldstone 
boson is absent. 
For example, chiral SU(2) symmetry is believed to be spontaneously broken,
so there must be Nambu-Goldstone bosons associated with this broken symmetry.
The Nambu-Goldstone bosons are the triplet of π mesons.

The axial U(1) symmetry, however, is more problematic. If it is preserved,
then all hadrons should be parity doubled. This is not the case, since the π meson
has no scalar partner.

The second possibility is that the axial U(1) symmetry is spontaneously broken,
in which case there should be a light Nambu-Goldstone boson. However, there
is no Nambu-Goldstone boson around the π meson mass. Weinberg has proved
a theorem that says that the U(1) Nambu-Goldstone boson should have mass less
than √3 m_π. However, there is no such particle. The particles that come closest,
the η(549) and the η′(958), fail to satisfy the Weinberg bound, and η(549) is
actually part of the pseudoscalar octet.

The U(1) problem, therefore, is to explain the absence of both parity doubling
and the Nambu-Goldstone boson for this symmetry.

This leaves open the third possibility, that the symmetry is not preserved
quantum mechanically. Indeed, one might suspect that the anomaly in the U(1)
current makes it impossible to construct conserved currents. There is indeed a
triangle anomaly, which breaks the conservation of the axial current. However, this
is not enough to solve the U(1) problem. By slightly modifying the calculation
of the triangle anomaly presented earlier to accommodate quark flavors, we can
calculate the contribution of the anomaly to the current conservation condition:


    \partial_\mu J_5^\mu = \frac{N_f g^2}{8\pi^2}\,\mathrm{Tr}\left(F_{\mu\nu}\tilde{F}^{\mu\nu}\right)    (16.143)


where N_f is the number of flavors. However, using the fact that:

    \mathrm{Tr}\left(F_{\mu\nu}\tilde{F}^{\mu\nu}\right) = 4\,\partial_\mu K^\mu    (16.144)


where:


    K_\mu = \frac{1}{2}\,\epsilon_{\mu\nu\rho\sigma}\,\mathrm{Tr}\left(A^\nu\partial^\rho A^\sigma - \frac{2}{3}\,ig\,A^\nu A^\rho A^\sigma\right)    (16.145)


we can construct a current that is indeed conserved:

    \tilde{J}_5^\mu = J_5^\mu - \frac{N_f g^2}{2\pi^2}\,K^\mu    (16.146)


The modified conserved charge Q_5 is given by:


    \frac{dQ_5}{dt} = \int d^3x\, \partial_\mu \tilde{J}_5^\mu = 0    (16.147)
Because of this, it appears that the modified current is still conserved, and that the
U(1) problem persists even in the presence of anomalies.

In conclusion, we seem to have exhausted all possibilities for the U(1) problem.
However, the solution to the problem was pointed out by ’t Hooft, who observed
that instantons can render the previous equation incorrect. He pointed out that
there is yet another contribution to Q_5 that may break the symmetry. If we
calculate the change in Q_5 between the distant past and the distant future, the
presence of instantons can create a nonzero value for ΔQ_5. We observe that:


    \Delta Q_5 = \int dt\, \frac{dQ_5}{dt} = \int d^4x\, \partial_\mu J_5^\mu
               = \frac{N_f g^2}{8\pi^2}\int d^4x\, \mathrm{Tr}\left(F_{\mu\nu}\tilde{F}^{\mu\nu}\right)    (16.148)


Usually, ΔQ_5 is equal to zero because the right-hand side is the integral over a pure
divergence, which vanishes at infinity. However, in the presence of instantons,
the right-hand side does not vanish at all. We know that the instanton has a finite
action, so the right-hand side is not zero and ΔQ_5 is not zero. Thus, there is no
Nambu-Goldstone boson because the current is not really conserved, and hence
the U(1) symmetry was not a good one in the first place.

(An equivalent way of stating this is to notice that the modified current is not 
gauge invariant. The Green’s functions for this modified symmetry may develop 
poles that naively indicate that there are Nambu—Goldstone bosons in the theory, 
but these Green’s functions are gauge variant, and these poles cancel against other 
poles. The gauge-invariant amplitudes, which add up both the gauge-variant 
particle and ghost poles, do not have a net pole, and hence there are no Nambu— 
Goldstone bosons.) 

Instantons, therefore, appear to give us a nice explanation for the fact that the 
Nambu-Goldstone boson associated with the breaking of axial U(1) symmetry 
is not experimentally observed. However, instantons solve one problem, only to 
raise another. 

We saw earlier that the instanton contribution to the effective Lagrangian, Δℒ,
vanishes perturbatively but may have nontrivial nonperturbative effects. In particular,
because of the presence of ε_μνρσ, it indicates that parity is violated by the strong
interactions. T is also violated, so there is a violation of CP. This is rather disturbing,
because CP is known to be conserved rather well by the strong interactions,
as measured by the neutron electric dipole moment, which is known experimentally
to obey d_n < 10^{−24} e·cm. This serves as an experimental constraint on the
parameters of the Standard Model, since we can calculate the perturbative and
nonperturbative (instanton) corrections to the neutron dipole moment. The
perturbative corrections to the moment can be shown to give a dipole moment much
smaller than this, which then gives us a bound on the nonperturbative correction.
This constraint gives us the bound on θ:


    \theta < 10^{-9}    (16.149)


This is the strong CP problem: If instanton effects necessarily contribute an extra
parameter to QCD, then why is θ so small?

In principle, if one or more of the quarks had been massless, we could have
absorbed this term and preserved CP invariance. For example, if the up quark
had been massless, then we could have made the usual chiral transformation on
ψ, which creates a change in the action given by:


    \delta S = -i\alpha \int d^4x\, \left(\partial_\mu J_5^\mu\right)    (16.150)


so that δS = 2N_f α. We could thus absorb the θ term by choosing an appropriate
α. However, the up quark is massive, so this line of argument is ruled out.

The simplest suggested solution to why θ is so small is to invoke yet another
U(1) symmetry, the Peccei-Quinn symmetry (ref. 10), which is preserved by a combined
QCD and electroweak theory. The presence of this additional U(1) symmetry
would be sufficient to keep θ = 0.

To see how the axion hypothesis works, consider the possibility of CP
violation in QCD caused by introducing a complex, nondiagonal mass matrix M
for the quarks: q̄_i M_ij q_j. Classically, M can be diagonalized and made real by
making a field redefinition of the quark fields q_i, so CP violation does not appear
as a consequence of a complex mass matrix M.

In this field redefinition, we made a chiral transformation on the quark fields to
eliminate an overall phase factor. Once quantum corrections are allowed, however,
we can no longer eliminate this phase factor with a chiral transformation. Since
the functional measure Dq Dq̄ is not invariant under a chiral transformation (see
Section 12.7), the chiral anomaly adds a θ term to the measure, given by:


    Dq\, D\bar{q} \to Dq\, D\bar{q}\, \exp\left(-i\,\mathrm{Arg\,Det}\,M \int d^4x\, \frac{1}{16\pi^2}\,\mathrm{Tr}\,F_{\mu\nu}\tilde{F}^{\mu\nu}\right)    (16.151)


Therefore the effective θ is given by:

    \theta \to \theta + \mathrm{Arg\,Det}\,M    (16.152)


To eliminate this effective θ term, consider adding a new field σ to the QCD action
given by:


    \mathcal{L}_{\mathrm{axion}} = \bar{\psi}\left(M e^{-i\sigma}\right)\psi + \frac{1}{2}\,\partial_\mu\sigma\,\partial^\mu\sigma    (16.153)

where σ is the axion field, which couples to the quark mass term via a phase
factor. [The axion arises as a Nambu-Goldstone boson of the new broken U(1)
symmetry of the quark and the Higgs sector.]

Now perform another axial U(1) transformation on the quark fields that
eliminates the F F̃ term entirely and puts all CP violating terms in the mass matrix.
We then find that the mass term in QCD is multiplied by:


    \exp\left[i\left(\theta + \mathrm{Arg\,Det}\,M - \sigma\right)\right]    (16.154)

Now make the trivial shift:

    \sigma \to \sigma + \theta + \mathrm{Arg\,Det}\,M    (16.155)


Since the axion is massless, the kinetic term is invariant under this shift, so the
shift is sufficient to absorb all CP violating terms that appear exclusively in the
mass matrix.

In this way, the introduction of a massless axion field, to lowest order, can
absorb all strong CP violating effects by a shift. (At higher orders, the axion
develops a mass, although we can still absorb the CP violating terms.)

Although the axion gives us a way in which the strong CP problem might
be solved, experimentally the situation is still unclear. Experimental searches
for the axion have been unsuccessful. In fact, the naive axion theory that we
have presented can actually be experimentally ruled out. However, it is still
possible to revive the axion theory if we assume that it is very light and weakly
coupled. Experimentally, this “invisible axion,” if it exists, should have a mass
between 10^{−6} and 10^{−3} eV. The invisible axion (ref. 11) would then be within the bounds
of experiments. Phenomenologically, it has been suggested that the axion may
solve certain cosmological problems, such as the missing mass problem (i.e., that
only 1% to 10% of the mass of the universe is visible, and the remaining mass is
invisible, in the form of “dark matter”). However, until the axion is discovered, the
strong CP problem is an open question and much of this discussion is speculative.

In summary, we have seen that the theory of solitons, monopoles, and instan- 
tons probes an area of quantum field theory that is not accessible by perturbative 
methods. 

Instantons (solitons) are classical finite-action (energy) solutions to the Eu- 
clidean (Minkowski) equations of motion which obey special properties. Their ex- 
istence proves that gauge theories are more sophisticated than previously thought. 
The existence of instantons, for example, is an indication that tunneling takes 
place in the theory. Instantons in gauge theory are useful in giving us a solution 
to the U(1) problem, but they also raise the question of strong CP violation.


16.7 Exercises


1. Show that the N-instanton solution in Eq. (16.118) solves the Yang—Mills 
equations of motion. 


2. Given the one-dimensional Lagrangian: 


oS sit + nat — Sexy (16.156) 


Plot the potential and show that the instanton solution is given by: 


1 


1 
ATE (16.157) 


a) = 


Where on the potential curve does this instanton make tunneling possible? 


3. Let us integrate this Lagrangian from —f/2 to B/2. Show that the energy and 
action of this system are finite and are given by: 


E(B) = —2e 8 + O(e~*F) 
: = (j —2eF + ore) (16.158) 
g \6 
4. Consider the Lagrangian: 
1 
B= 5° +g" [1 —cos(ex)] (16.159) 


Again, graph the potential and show that the instanton solution is given by: 
a= tant) (16.160) 
V8 


Between what states does this instanton make tunneling possible? 


5. Show that the action is finite (when integrated as before) and that: 
S=-— (16.161) 
6. Consider a massless four-dimensional φ^4 theory with the action:

$$ \mathcal{L} = \frac{1}{2} (\partial_\mu \phi)^2 + \frac{1}{4} g \phi^4 $$   (16.162)

Show that the instanton solution is given by:

$$ \phi(x) = \pm \frac{1}{\sqrt{-g}} \, \frac{2\sqrt{2}\,\lambda}{1 + \lambda^2 (x - x_0)^2} $$   (16.163)

where λ is an arbitrary constant. Show that the action is again finite, with:

$$ S = -\frac{8\pi^2}{3g} $$   (16.164)


7. Prove by direct computation that the sine-Gordon equation is solved by the
soliton solutions in Eqs. (16.24) and (16.25).


8. Consider the two-dimensional complex scalar theory with Lagrangian:
F% =8,0'34"c + Voto) (16.165) 
Show that if V is given by: 
es [a —oto)? +67] (16.166) 
1+e? 


then a solution is given by: 


HPL 
a ay 
fe eee Nee 16.167 
. Osx) ) 
with: 
a = (lt+e*)(1—@”’) 


2V 1 — w(x — &) (16.168) 


i 


y 


9. Prove that if the soliton system is translationally invariant, there is a zero
mode in Eq. (16.97).


10. For the 't Hooft-Polyakov monopole, prove explicitly that the solution for
A_i^a and φ^a in Eq. (16.62) solves the equations of motion of the monopole at
large distances.

11. For the nonlinear O(3) model, in Eqs. (16.33) and (16.34), define the variables:

$$ \omega_1 = 2\phi_1/(1 - \phi_3); \qquad \omega_2 = 2\phi_2/(1 - \phi_3) $$

$$ \omega = \omega_1 + i\omega_2; \qquad \phi = \phi_1 + i\phi_2 $$   (16.169)

Show that:
a1 = [2/1 — $3) id + @ 1 ds) (16.170) 
Show that the self-duality condition in Eq. (16.48) becomes: 


+i(b 32 $3) 
+i(@ 31 3) (16.171) 


dip 
dp 


12. Show that this self-duality condition can be rewritten as:

$$ \frac{\partial \omega_1}{\partial x_1} = \pm \frac{\partial \omega_2}{\partial x_2}; \qquad \frac{\partial \omega_1}{\partial x_2} = \mp \frac{\partial \omega_2}{\partial x_1} $$   (16.172)

This means that the self-duality condition reduces to the Cauchy-Riemann
condition. This, in turn, means that any analytic function of z = x_1 + i x_2 will
satisfy the self-duality condition, and hence the equations of motion.


13. Now choose the following analytic function:

$$ \omega(z) = [(z - z_0)/\lambda]^n $$   (16.173)

where n is an integer. Show that Q in Eq. (16.41) can be written as:


1 n?\z = zi Am 
= | 16.174 
g An | (A2" + 3 |z — 2/2")? ! 


Using polar coordinates, perform the integration and show: 
Q=n (16.175) 


as expected. 


14. Prove that the invariant measure given in Eq. (16.128) satisfies the property 
dU = d(UoU) if Up is a constant. 


15. Prove that the measure in Eq. (16.42) is generally covariant under a reparametriza- 
tion of the coordinates. 


16. Another theory with instantons is the CP_N theory. We begin with N + 1
complex scalar fields n_a(x) = n. The Euclideanized action is given by:

$$ \mathcal{L} = D_\mu n^* \cdot D_\mu n $$   (16.176)

where:

$$ D_\mu n = (\partial_\mu + i A_\mu) n $$   (16.177)

We also have the constraint:

$$ n^* \cdot n = \sum_{a=1}^{N+1} n_a^* n_a = 1 $$   (16.178)

Eliminate the A_μ field by its equations of motion. Show that the resulting
action is:

$$ \mathcal{L} = (\partial_\mu n^* \cdot \partial_\mu n) + (n^* \cdot \partial_\mu n)(n^* \cdot \partial_\mu n) $$   (16.179)

Show that this action is invariant under:

$$ \partial_\mu n \rightarrow (\partial_\mu n + i\,\partial_\mu \Lambda\, n)\, e^{i\Lambda} $$

$$ n^* \cdot \partial_\mu n \rightarrow n^* \cdot \partial_\mu n + i\,\partial_\mu \Lambda $$   (16.180)


17. For the CP_N model, show that the positive-definite quantity:


/ d’x (Dyn + i€yyD yn) - (Dyn + i¢,,D,n) > 0 (16.181) 
reduces to: 
2 / d?x [(D,n)* - (Dyn) + ie,,(Dyn*)- Dyn] > 0 (16.182) 
Show that this proves:

$$ S \geq 2\pi |Q| $$   (16.183)


where we define the topological charge as: 


1 
QO = -s [as evans 


~ 5 f dx €vv(Dyn")- Dyn (16.184) 


18. Prove that Eq. (16.109) solves Eq. (16.108). 


Chapter 17 


Phase Transitions 
and Critical Phenomena 


17.1 Critical Exponents 


Historically, there has been a fair amount of cross pollination between statistical 
systems and quantum field theory, to the benefit of both disciplines. In the past few 
decades, many of the successful ideas in quantum field theory actually originated 
in statistical systems, such as spontaneous symmetry breaking and lattice field 
theory. 

There are several advantages that such statistical systems have over quantum 
field theory. First, many of them, in lower dimensions, are exactly solvable. 
Thus, they have served as a theoretical “laboratory” in which to test many of our 
ideas about much more complicated quantum field theoretical systems. Second, 
even simple statistical models exhibit nontrivial nonperturbative behavior. While a 
rigorous nonperturbative treatment of quantum field theory is notoriously difficult, 
even the simple classical Ising model shows a rich nonperturbative structure. 

Thus, statistical systems have helped to enrich our understanding of quantum 
field theory. Even though they only have a finite number of degrees of freedom, 
they have served as a surprisingly faithful mirror to the qualitative features of our 
physical world. 

We begin our discussion of statistical mechanics by making a few basic defi- 
nitions. Whether discussing the properties of a solid, liquid, or gas, we will base 
our discussion on the classical Boltzmann partition function: 


$$ Z = \sum_n \exp\left( -\frac{E_n}{kT} \right) $$   (17.1)

where E_n represents the energy of the nth state, k represents the Boltzmann
constant, and T represents the temperature.

(There is a close relationship between this partition function and the generating 
functional of quantum field theory: 


$$ Z = \int D\phi \, \exp\left( i \int L(\phi) \, d^4x \right) $$   (17.2)


The difference, however, is that the field theory generating functional has an 
imaginary exponent in Minkowski space and is defined over an infinite number of 
degrees of freedom.) 

In statistical mechanics, the fundamental quantity we wish to calculate is called 
the free energy, defined by: 


$$ F = -kT \log Z $$   (17.3)


We say that a statistical model is exactly solvable if we can solve for an explicit 
expression for the free energy. 
As in field theory, the statistical average of any observable X is given by: 


$$ \langle X \rangle = Z^{-1} \sum_n X_n \exp\left( -\frac{E_n}{kT} \right) $$   (17.4)
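
The definitions in Eqs. (17.1), (17.3), and (17.4) can be tried out on a system with only a few levels. The following is a minimal sketch, assuming a hypothetical two-level system (a single spin σ = ±1 with energy E = -Hσ, in units where the magnetic moment is 1); the function names are illustrative and not part of the text.

```python
import math


def partition_function(energies, kT):
    """Boltzmann sum Z = sum_n exp(-E_n / kT) over a finite list of levels."""
    return sum(math.exp(-e / kT) for e in energies)


def free_energy(energies, kT):
    """Free energy F = -kT log Z, as in Eq. (17.3)."""
    return -kT * math.log(partition_function(energies, kT))


def thermal_average(values, energies, kT):
    """Statistical average <X> = Z^{-1} sum_n X_n exp(-E_n / kT), as in Eq. (17.4)."""
    z = partition_function(energies, kT)
    return sum(x * math.exp(-e / kT) for x, e in zip(values, energies)) / z


# Hypothetical example: one spin sigma = +1 or -1 in a field H (assumed units).
H, kT = 0.5, 1.0
spins = [+1, -1]
energies = [-H * s for s in spins]
mean_spin = thermal_average(spins, energies, kT)
```

For this toy system the sums can be done by hand, giving Z = 2 cosh(H/kT) and an average spin of tanh(H/kT), which is a convenient check on the code.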


There are only a few models that are exactly solvable (usually in two dimen- 
sions).':* Some of them include the Ising model, the ferroelectric six-vertex model, 
the eight-vertex model, the three-spin model, and the hard hexagon model. There 
are also classes of solvable models, such as the RSOS (restricted solid-on-solid) 
models. 

One of the earliest successes of these models was their ability to describe the 
properties of simple ferromagnets. For example, if we know that an atom has a
magnetic moment μ, then the energy of the atom in an external magnetic field H
is given by the dot product:


$$ E = -\mu \cdot H $$   (17.5)


For quantum systems, we know that the magnetic moment is proportional to
the spin σ_i. Therefore it is customary to add to the action the term:


Ho+H 06; (17.6) 


For systems with a magnetic field, for example, the magnetization M is defined 
to be the average of the magnetic moment per site: 


$$ M(H, T) = N^{-1} \langle \sigma_1 + \cdots + \sigma_N \rangle $$   (17.7)

In the limit that N → ∞, we can describe the magnetization as:

$$ M(H, T) = -\frac{\partial}{\partial H} F(H, T) $$   (17.8)

because taking the derivative with respect to H simply brings down σ_i into the
sum.
The susceptibility is then defined as: 


$$ \chi(H, T) = \frac{\partial M(H, T)}{\partial H} $$   (17.9)

which is related to the second derivative of the free energy.
Similarly, the specific heat can be defined as:

$$ C = -T \frac{\partial^2 F}{\partial T^2} $$   (17.10)
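
The derivative relations in Eqs. (17.8) to (17.10) can be checked numerically on a free energy that is known in closed form. The sketch below assumes the free energy of a single non-interacting spin, F = -kT log(2 cosh(H/kT)), which is not a model treated in the text but has simple exact answers; derivatives are taken by finite differences, and all names and step sizes are illustrative.

```python
import math


def free_energy(H, T, k=1.0):
    """Assumed toy model: one spin sigma = +1/-1 with energy -H*sigma."""
    return -k * T * math.log(2.0 * math.cosh(H / (k * T)))


def magnetization(H, T, step=1e-4):
    """M = -dF/dH, Eq. (17.8), by a central difference."""
    return -(free_energy(H + step, T) - free_energy(H - step, T)) / (2.0 * step)


def susceptibility(H, T, step=1e-4):
    """chi = dM/dH, Eq. (17.9), by a central difference of M."""
    return (magnetization(H + step, T) - magnetization(H - step, T)) / (2.0 * step)


def specific_heat(H, T, step=1e-3):
    """C = -T d^2F/dT^2, Eq. (17.10), by a second central difference."""
    second = (free_energy(H, T + step) - 2.0 * free_energy(H, T)
              + free_energy(H, T - step)) / step ** 2
    return -T * second


H, T = 0.3, 1.2
x = H / T
exact_M = math.tanh(x)                    # closed forms for the toy model (k = 1)
exact_chi = 1.0 / (T * math.cosh(x) ** 2)
exact_C = (x / math.cosh(x)) ** 2
```

The closed forms in the last three lines follow from differentiating the assumed F by hand, so agreement with the finite-difference values is a test of the definitions rather than of any particular lattice model.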


If there is a collection of spins σ_i arranged in some regular two-dimensional
lattice, then we define the correlation function g_ij between the ith and jth spins
as:


$$ g_{ij} = \langle \sigma_i \sigma_j \rangle - \langle \sigma_i \rangle \langle \sigma_j \rangle $$   (17.11)


In general, we find that the function g_ij will depend on the distance x separating
the spins, and at large distances, it will behave like some decreasing power of x
multiplied by some exponential:


$$ g_{ij} \sim x^{-\tau} e^{-x/\xi} $$   (17.12)


where ξ is called the correlation length.

Near the critical temperature T., we find that these physical parameters, like 
the magnetization, either vanish or diverge. Intuitively, for example, we know 
that a magnetized substance begins to lose its magnetic properties as we increase 
the temperature and the spins become random. At the critical temperature, we 
find that the magnetic properties of the substance vanish. For example: 


$$ \chi(0, T) \sim (T - T_c)^{-\gamma} \quad (T \rightarrow T_c^+), \qquad \chi(0, T) \sim (T_c - T)^{-\gamma'} \quad (T \rightarrow T_c^-) $$   (17.13)

where γ is the critical exponent that describes the susceptibility slightly above
the critical temperature. (We will use primed symbols to represent the critical
exponent just below the critical temperature.) This is shown graphically in Figure
17.1 for another quantity, the magnetization.


Figure 17.1. The critical exponent β governs the behavior of the magnetization M below
the critical temperature.


The magnetization vanishes at T_c, and its behavior is determined by the critical
exponent β:


$$ M \sim (T_c - T)^{\beta} $$   (17.14)


Then the critical exponent α characterizes the behavior of the specific heat near
the critical temperature:


$$ C \sim (T - T_c)^{-\alpha} $$   (17.15)


We can describe the behavior of the correlation length near the critical temperature
as:


$$ \xi \sim (T - T_c)^{-\nu} $$   (17.16)


In what are called second-order transitions, as the temperature approaches the 
critical temperature, the correlation length goes to infinity. (Because the system at 
criticality loses its dependence on a length scale for these transitions, the system 
becomes symmetric under conformal transformations. This means that we can use 
the constraints coming from conformal invariance to place stringent restrictions 
on the free energy at criticality. This will prove to be crucial in our discussion of 
scaling and the renormalization group.) 

At the transition point, the system loses all dependence on any fundamental
length scale, so the correlation function exhibits a power behavior:


$$ g_{ij} \sim x^{-d+2-\eta} $$   (17.17)


Also, the magnetization, at the critical point for weak magnetic fields, obeys the
relation:


$$ M \sim H^{1/\delta} $$   (17.18)

We can summarize all this with the following simplified chart:


Quantity          Definition                       Critical behavior
Magnetization     M = -∂F/∂H                       ~ (T_c - T)^β
Magnetization     M at T = T_c                     ~ H^(1/δ)
Specific Heat     C = -T (∂²F/∂T²)                 ~ (T - T_c)^(-α)
Corr. Function    g_ij ~ <σ_i σ_j>                 ~ x^(-τ) e^(-x/ξ)
Corr. Function    g_ij at T = T_c                  ~ x^(-d+2-η)
Corr. Length      ξ ~ -x / log g(x)                ~ (T - T_c)^(-ν)          (17.19)


17.2 The Ising Model 


One of the first, and simplest, statistical systems to be analyzed was the Ising 
model’ in one dimension. Ising, who proposed and solved the model in 1925, 
showed that the system had interesting physical properties, with a critical point at
H = T = 0.

We begin by placing a series of spins σ_i, which can take the values ±1, at
regular intervals along a line. The energy of the system is given by:


$$ E(\sigma) = -J \sum_{j=1}^{N} \sigma_j \sigma_{j+1} - H \sum_{j=1}^{N} \sigma_j $$   (17.20)


where the jth spin only interacts with its nearest neighbors at the j - 1 and j + 1
sites, and where H is the external magnetic field.
Then the partition function can be written as:


$$ Z_N = \sum_{\sigma} \exp\left( K \sum_{j=1}^{N} \sigma_j \sigma_{j+1} + h \sum_{j=1}^{N} \sigma_j \right) $$   (17.21)


where we have rescaled the parameters via K = J/kT and h = H/kT.
We will find it convenient to introduce a 2 x 2 matrix: 


$$ V(\sigma, \sigma') = \exp\left( K \sigma \sigma' + \tfrac{1}{2} h (\sigma + \sigma') \right) $$   (17.22)

This matrix V, which is called the transfer matrix, depends on whether the spins
are +1 or -1; that is:


$$ V = \begin{pmatrix} V(+,+) & V(+,-) \\ V(-,+) & V(-,-) \end{pmatrix} = \begin{pmatrix} e^{K+h} & e^{-K} \\ e^{-K} & e^{K-h} \end{pmatrix} $$   (17.23)
Now comes the crucial transformation. We will rewrite the partition function
as a sum over a series of matrices:


$$ Z_N = \sum_{\sigma} V(\sigma_1, \sigma_2) V(\sigma_2, \sigma_3) \cdots V(\sigma_{N-1}, \sigma_N) V(\sigma_N, \sigma_1) $$   (17.24)


Therefore, the partition function can now be succinctly rewritten as:

$$ Z_N = \mathrm{Tr}\, V^N $$   (17.25)
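
The identity between the configuration sum of Eq. (17.21) and the trace of Eq. (17.25) can be checked directly for a short periodic chain. The following is a minimal sketch, assuming the symmetric transfer matrix V(σ, σ') = exp(Kσσ' + h(σ + σ')/2) and periodic boundary conditions (σ_{N+1} = σ_1); the helper names are hypothetical.

```python
import math
from itertools import product

SPINS = (+1, -1)


def transfer_matrix(K, h):
    """2x2 transfer matrix with rows and columns ordered as (+1, -1)."""
    return [[math.exp(K * s * t + 0.5 * h * (s + t)) for t in SPINS] for s in SPINS]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


def trace_of_power(v, n):
    """Tr V^n by repeated multiplication."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        result = matmul(result, v)
    return result[0][0] + result[1][1]


def brute_force_partition(K, h, n):
    """Direct sum over all 2^n spin configurations of a periodic chain."""
    total = 0.0
    for config in product(SPINS, repeat=n):
        bonds = sum(config[j] * config[(j + 1) % n] for j in range(n))
        total += math.exp(K * bonds + h * sum(config))
    return total


K, h, n = 0.7, 0.3, 8
z_trace = trace_of_power(transfer_matrix(K, h), n)
z_direct = brute_force_partition(K, h, n)
```

The brute-force sum grows as 2^N, while the trace needs only N multiplications of 2 x 2 matrices, which is the practical content of the "reshuffling" described next.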


On one hand, we have done nothing. We have merely reshuffled the summation 
within Z, by rewriting it as a sum over the 2 x 2 transfer matrix V. On the other 
hand, we have made an enormous conceptual difference, because we can now 
diagonalize the transfer matrix in terms of its eigenvalues; that is, there exists a 
matrix P that diagonalizes V: 


$$ V = P \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} P^{-1} $$   (17.26)


Substituting this into our original expression for the partition function, we now 
find: 


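$$ Z_N = \mathrm{Tr} \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}^{N} = \lambda_1^N + \lambda_2^N $$   (17.27)


Let λ_1 be the larger of the two eigenvalues, which will then dominate the sum in
the limit as N → ∞. We then have:


$$ F(H, T) = -kT \lim_{N \to \infty} N^{-1} \log Z_N = -kT \log \lambda_1 = -kT \log\left( e^{K} \cosh h + \sqrt{e^{2K} \sinh^2 h + e^{-2K}} \right) $$   (17.28)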
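
The explicit form of the larger eigenvalue follows from the characteristic equation of the 2 x 2 transfer matrix. The short derivation below is a sketch, assuming the symmetric form of V with diagonal entries e^{K+h} and e^{K-h} and off-diagonal entries e^{-K}:

```latex
\det(V - \lambda \mathbf{1}) = \lambda^2 - (\mathrm{Tr}\,V)\,\lambda + \det V = 0,
\qquad
\mathrm{Tr}\,V = 2 e^{K} \cosh h,
\qquad
\det V = e^{2K} - e^{-2K}

\lambda_{1,2}
  = e^{K} \cosh h \pm \sqrt{e^{2K} \cosh^2 h - e^{2K} + e^{-2K}}
  = e^{K} \cosh h \pm \sqrt{e^{2K} \sinh^2 h + e^{-2K}}
```

The root with the plus sign is λ_1, the one that dominates Z_N for large N.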


In addition to having an exact expression for the free energy, we also have an 
exact expression for the magnetization: 


$$ M(H, T) = \frac{e^{K} \sinh h}{\sqrt{e^{2K} \sinh^2 h + e^{-2K}}} $$   (17.29)

This is an important result: We have obtained the complete solution for the free
energy and magnetization in an exactly solvable statistical mechanical system.
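
The closed forms for the free energy and the magnetization can be cross-checked against each other, since M = -∂F/∂H. The sketch below assumes the expressions of Eqs. (17.28) and (17.29) with K = J/kT and h = H/kT, and takes the derivative by a central difference; function names and the step size are illustrative choices.

```python
import math


def largest_eigenvalue(K, h):
    """lambda_1 = e^K cosh h + sqrt(e^{2K} sinh^2 h + e^{-2K})."""
    return (math.exp(K) * math.cosh(h)
            + math.sqrt(math.exp(2.0 * K) * math.sinh(h) ** 2 + math.exp(-2.0 * K)))


def free_energy_per_site(J, H, kT):
    """F = -kT log lambda_1 in the limit N -> infinity."""
    return -kT * math.log(largest_eigenvalue(J / kT, H / kT))


def magnetization_exact(J, H, kT):
    """Closed form of the magnetization for the one-dimensional chain."""
    K, h = J / kT, H / kT
    return (math.exp(K) * math.sinh(h)
            / math.sqrt(math.exp(2.0 * K) * math.sinh(h) ** 2 + math.exp(-2.0 * K)))


def magnetization_numeric(J, H, kT, step=1e-5):
    """M = -dF/dH by a central difference of the free energy."""
    return -(free_energy_per_site(J, H + step, kT)
             - free_energy_per_site(J, H - step, kT)) / (2.0 * step)


J, H, kT = 1.0, 0.2, 1.5
m_exact = magnetization_exact(J, H, kT)
m_numeric = magnetization_numeric(J, H, kT)
```

At H = 0 the closed form gives M = 0 for every T > 0, so the one-dimensional chain has no spontaneous magnetization at finite temperature.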
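Because we have an exact expression for the transfer matrix, we can now solve
for the correlation length and show that it goes to infinity when H = T = 0. To
do this, we need to calculate the averages <σ_i> and <σ_i σ_j>. We begin by defining
the matrix S in spin space as:

$$ S = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} $$   (17.30)

which has elements:

$$ S(\sigma, \sigma') = \sigma \, \delta(\sigma, \sigma') $$   (17.31)


Therefore, the average can be written as:


$$ \langle \sigma_1 \sigma_3 \rangle = Z_N^{-1} \sum_{\sigma} \sigma_1 V(\sigma_1, \sigma_2) V(\sigma_2, \sigma_3) \sigma_3 \cdots = Z_N^{-1}\, \mathrm{Tr}\, S V^2 S V^{N-2} $$   (17.32)


So:


$$ \langle \sigma_i \sigma_j \rangle = Z_N^{-1}\, \mathrm{Tr}\, S V^{j-i} S V^{N+i-j} $$

$$ \langle \sigma_i \rangle = Z_N^{-1}\, \mathrm{Tr}\, S V^{N} $$   (17.33)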


Now let the matrix P, which diagonalizes the transfer matrix, be parametrized
by an angle φ:


$$ P = \begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix} $$   (17.34)

Then, in the limit N → ∞, we have:

$$ g_{ij} = \langle \sigma_i \sigma_j \rangle - \langle \sigma_i \rangle \langle \sigma_j \rangle = \cos^2 2\phi + \sin^2 2\phi \, (\lambda_2/\lambda_1)^{j-i} - \cos^2 2\phi = \sin^2 2\phi \, (\lambda_2/\lambda_1)^{j-i} $$   (17.35)

So, we have the desired result:


$$ \xi = [\log(\lambda_1/\lambda_2)]^{-1} $$   (17.36)

At H = 0, we have:

$$ \lim_{T \to 0^+} \frac{\lambda_2}{\lambda_1} = 1 $$   (17.37)


so ξ tends to ∞ as H, T → 0. Thus, all reference to a length scale has disappeared,
as expected, at criticality.
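
The growth of the correlation length as T → 0 can be made concrete by evaluating ξ = 1/log(λ_1/λ_2) for increasing K = J/kT. The sketch below assumes the two eigenvalues of the 2 x 2 transfer matrix in the form e^K cosh h ± sqrt(e^{2K} sinh² h + e^{-2K}); at zero field their ratio λ_2/λ_1 reduces to tanh K. The sample values of K are arbitrary.

```python
import math


def eigenvalues(K, h):
    """Both eigenvalues of the one-dimensional transfer matrix, larger one first."""
    centre = math.exp(K) * math.cosh(h)
    root = math.sqrt(math.exp(2.0 * K) * math.sinh(h) ** 2 + math.exp(-2.0 * K))
    return centre + root, centre - root


def correlation_length(K, h):
    """xi = 1 / log(lambda_1 / lambda_2), in units of the lattice spacing."""
    lam1, lam2 = eigenvalues(K, h)
    return 1.0 / math.log(lam1 / lam2)


# Lower temperature means larger K = J/kT; xi should grow without bound.
couplings = (0.5, 1.0, 2.0, 4.0)
lengths = [correlation_length(K, 0.0) for K in couplings]
```

Because tanh K approaches 1 only exponentially fast, ξ grows roughly like e^{2K}/2 at large K rather than as a power of the temperature.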
Since the model has been solved exactly, it is now an easy task to calculate 


the critical exponents for the theory: 


yey 


Ising model : (17.38) 


3s -P DW NV 
H 


a= 
0 
0-0) 
1 


Another lesson that we have learned from this simple example is that the 
model was solvable because the partition function could be written in terms of 
a single matrix, the transfer matrix, which obviously commuted with itself. For 
more complicated models in two dimensions, we will find more than one transfer 
matrix, and the essential reason why some of them are exactly solvable is that 
their transfer matrices commute with each other. 

Now that we have some experience using the transfer matrix technique, let
us tackle a nontrivial problem, the two-dimensional Ising model, which was first
solved exactly by Onsager* in 1944 for the zero magnetic field case. Its partition
function is given by:


$$ Z_N = \sum_{\sigma} \exp\left( K \sum_{(i,j)} \sigma_i \sigma_j + L \sum_{(i,k)} \sigma_i \sigma_k \right) $$   (17.39)


where the (i, j) sum is taken symbolically over the nearest-neighbor horizontal
sites on the lattice and the (i, k) sum is taken over the vertical lattice sites.

Now rotate the lattice by 45 degrees so the lattice sites are arranged diagonally, 
as in Figure 17.2. 

Let us perform the sum over these rotated lattice sites first in the horizontal 
direction over n sites, and then in the vertical direction over m sites. Let W and 
V represent the partial sums taken in the horizontal direction. W and V alternate 
as we descend down the lattice in a vertical direction. Then the partition function 
is the sum of WVWVW··· taken in the vertical direction.


To sum the lattice sites horizontally, let φ = {σ_1, σ_2, ..., σ_n}; that is, φ is the
set of spins taken along a horizontal direction over n sites. Since each spin can
take on two values, φ has 2^n possible values. Let φ' = {σ'_1, σ'_2, ..., σ'_n} be the set
of horizontal sites just below φ.


Figure 17.2. In the two-dimensional Ising model, we rotate the lattice by 45 degrees. By
summing horizontally across the lattice, we obtain V and W. Then Z_N is the sum over
VWVWVWV···.


Then we can define W and V as follows: 


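$$ V_{\phi,\phi'} = \exp\left( \sum_{j=1}^{n} \left( K \sigma_{j+1} \sigma'_j + L \sigma_j \sigma'_j \right) \right) $$

$$ W_{\phi,\phi'} = \exp\left( \sum_{j=1}^{n} \left( K \sigma_j \sigma'_j + L \sigma_j \sigma'_{j+1} \right) \right) $$   (17.40)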


where W and V are now 2” x 2” matrices. As before, we can perform the sum 
over the two transfer matrices by summing vertically over the lattice: 


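$$ Z_N = \sum_{\phi_1} \sum_{\phi_2} \cdots \sum_{\phi_m} V_{\phi_1 \phi_2} W_{\phi_2 \phi_3} \cdots W_{\phi_m \phi_1} $$   (17.41)


Written in matrix form, this becomes:


$$ Z_N = \mathrm{Tr}\, (VW)^{m/2} = \sum_{i=1}^{2^n} \Lambda_i^{m/2} $$   (17.42)


where Λ_i are the eigenvalues of VW. In the thermodynamic limit, as we let the number
of points n, m → ∞, the partition function is once again dominated by the largest
eigenvalue of the transfer matrix VW:


$$ \lim_{n,m \to \infty} Z_N \sim (\Lambda_{\max})^{m/2} $$   (17.43)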
The one-dimensional and two-dimensional Ising models are therefore closely 
related to each other, and the calculation of the free energy (which we omit) 
reduces to calculating the largest eigenvalue of the transfer matrix. 
We should also mention that there are a number of models that generalize the
behavior of the Ising model and are exactly solvable. More important, there are a
number of models that, although they may not be exactly solvable, exhibit critical
behavior that can be described by the known conformal field theories. Let us list
a few of these models and their properties.


17.2.1 XYZ Heisenberg Model 


Closely related to the Ising model is the XYZ Heisenberg model. Here, we replace
the spin σ_i with a Pauli matrix. The Hamiltonian is given by:


a = (Koff + SyoF OR + FOF ofa + ), Uae, 


oO 


where the sum is taken both horizontally and vertically over the entire lattice. 
We have different models for different values of J_i:


If J_x = J_y = J_z, then this is the usual Heisenberg model.


If J_x = J_y = 0, then only J_z survives, and hence we obtain the usual Ising
model.


If J_z = 0, then we have the XY model.


If J_x = J_y, then we have the Heisenberg-Ising model.


17.2.2. IRF and Vertex Models 


A large number of exactly solvable models can be grouped into two categories, 
the IRF (interactions around a face) and the vertex models, which differ by the 
way in which we place spins on a regular lattice. 

The IRF model includes the Ising model and many of the other exactly solvable
models. If we place four spins a, b, c, and d (which can equal +1 or 0) around
the four corners of a plaquette, the energy associated with the plaquette will be
ε(a, b, c, d); so we define the Boltzmann weight of the plaquette as:


$$ w(a, b, c, d) = \exp[-\epsilon(a, b, c, d)/kT] $$   (17.45)


For different choices of ε(a, b, c, d), we can represent a wide variety of models.
For example, the Ising model can be represented as:


$$ \epsilon(a,b,c,d) = -\tfrac{1}{2} J \left[ (2a - 1)(2b - 1) + (2c - 1)(2d - 1) \right] - \tfrac{1}{2} J' \left[ (2c - 1)(2b - 1) + (2d - 1)(2a - 1) \right] $$   (17.46)

and the eight-vertex model can be written as:


$$ \epsilon(a,b,c,d) = -J (2a - 1)(2c - 1) - J' (2b - 1)(2d - 1) - J'' (2a - 1)(2b - 1)(2c - 1)(2d - 1) $$   (17.47)


for a, b, c, d = 0, 1. The partition function for the IRF model is given by:


$$ Z_N = \sum_{\sigma} \prod_{(i,j,k,l)} w(\sigma_i, \sigma_j, \sigma_k, \sigma_l) $$   (17.48)
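
The IRF form of the Ising model can be checked against the usual bond sum on a very small lattice. The sketch below assumes a plaquette energy of the form ε = -(J/2)[(2a-1)(2b-1) + (2c-1)(2d-1)] - (J'/2)[(2c-1)(2b-1) + (2d-1)(2a-1)], with a, b, c, d read counterclockwise from the lower-left corner, and a periodic L x L lattice; the corner ordering, lattice size, and function names are assumptions made for the illustration.

```python
import math
from itertools import product


def site(bits, L, x, y):
    """Occupation number (0 or 1) at lattice point (x, y) with periodic wrapping."""
    return bits[(y % L) * L + (x % L)]


def face_energy(a, b, c, d, J, Jp):
    """Assumed IRF plaquette energy for the Ising model; spins are 2a - 1, etc."""
    sa, sb, sc, sd = (2 * v - 1 for v in (a, b, c, d))
    return -0.5 * J * (sa * sb + sc * sd) - 0.5 * Jp * (sc * sb + sd * sa)


def irf_partition(L, J, Jp, kT):
    """Sum over configurations of the product of plaquette Boltzmann weights."""
    total = 0.0
    for bits in product((0, 1), repeat=L * L):
        weight = 1.0
        for x in range(L):
            for y in range(L):
                a, b = site(bits, L, x, y), site(bits, L, x + 1, y)
                c, d = site(bits, L, x + 1, y + 1), site(bits, L, x, y + 1)
                weight *= math.exp(-face_energy(a, b, c, d, J, Jp) / kT)
        total += weight
    return total


def bond_partition(L, J, Jp, kT):
    """Direct Ising sum with J on horizontal bonds and Jp on vertical bonds."""
    total = 0.0
    for bits in product((0, 1), repeat=L * L):
        energy = 0.0
        for x in range(L):
            for y in range(L):
                s = 2 * site(bits, L, x, y) - 1
                energy -= J * s * (2 * site(bits, L, x + 1, y) - 1)
                energy -= Jp * s * (2 * site(bits, L, x, y + 1) - 1)
        total += math.exp(-energy / kT)
    return total


z_faces = irf_partition(3, 1.0, 0.5, 2.0)
z_bonds = bond_partition(3, 1.0, 0.5, 2.0)
```

The factors of 1/2 in the plaquette energy compensate for the fact that every bond is shared by two neighboring plaquettes, which is why the two sums agree.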


The other large class of models is given by the vertex models. For example, 
in ice, we have the molecules of water held together by electric dipole moments. 
Let us place water molecules on a square two-dimensional lattice, such that the 
line segments forming the lattice correspond to the electric fields, represented by 
arrows. 

These arrows have only two directions on any given line segment. If we 
impose the rule that there are always two arrows pointing out of and two arrows 
pointing into each vertex, then at any lattice site, there are six different possible 
orientations of the arrows. Each of these six different orientations will have an
energy associated with it, called ε_i, for i = 1, 2, ..., 6. Thus, if φ represents the
lattice sites along a horizontal line, then we have the six-vertex model:


$$ Z = \sum_{\phi_1} \cdots \sum_{\phi_M} V(\phi_1, \phi_2) V(\phi_2, \phi_3) \cdots V(\phi_M, \phi_1) = \mathrm{Tr}\, V^{M} $$   (17.49)


where: 


Vo. 6!) = Drexp (-“ tee) (17.50) 


The partition function can be totally rewritten in terms of:


$$ w(i, j, k, l) = \exp[-\epsilon(i, j, k, l)/kT] $$   (17.51)


Different values of ε(i, j, k, l) correspond to different models.
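
The count of six allowed arrow arrangements under the "two in, two out" rule can be confirmed by enumeration. The following is a minimal sketch, assuming each of the four edges at a vertex carries +1 for an arrow pointing out and -1 for an arrow pointing in; the edge labels and the sample energies ε_1, ..., ε_6 are hypothetical.

```python
import math
from itertools import product

EDGES = ("north", "east", "south", "west")


def ice_rule_vertices():
    """All arrow assignments on the four edges with two arrows out and two in."""
    allowed = []
    for arrows in product((+1, -1), repeat=len(EDGES)):
        if sum(arrows) == 0:  # equal numbers of outgoing (+1) and incoming (-1)
            allowed.append(dict(zip(EDGES, arrows)))
    return allowed


def boltzmann_weights(energies, kT):
    """Weight exp(-epsilon_i / kT) for each allowed vertex type."""
    return [math.exp(-e / kT) for e in energies]


vertices = ice_rule_vertices()
# Illustrative energies only; the text leaves epsilon_1..epsilon_6 unspecified.
weights = boltzmann_weights([0.0, 0.0, 0.5, 0.5, 1.0, 1.0], kT=1.0)
```

Of the 2^4 = 16 possible arrow assignments, exactly C(4, 2) = 6 satisfy the rule, one weight per allowed vertex.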


17.3. Yang—Baxter Relation 


The reason for the exact solvability of these models is that the transfer matrices, 
which define the partition function and free energy, commute. When expressed 
mathematically, this relationship becomes the celebrated Yang-Baxter relation.'°
In fact, mutually commuting transfer matrices, or equivalently the Yang-Baxter
relation, are sufficient conditions for the solvability of any two-dimensional model.
(One way to show this is that commuting transfer matrices give us an infinite set
of conserved currents, which are sufficient to solve the system exactly. For more
precise details and subtleties, the reader is referred to the literature.)

Let us study the Yang-Baxter relation in terms of the vertex models. Let w(μ_i,
α_i|β_i, μ_{i+1}) represent the contribution to the sum from the ith site. Each Greek
index, in turn, can have values of ±1. Let us perform the sum horizontally, as
before:


$$ V_{\alpha,\beta} = \sum_{\mu_1} \cdots \sum_{\mu_N} w(\mu_1, \alpha_1|\beta_1, \mu_2) \, w(\mu_2, \alpha_2|\beta_2, \mu_3) \cdots w(\mu_N, \alpha_N|\beta_N, \mu_1) $$   (17.52)
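Let V' represent another transfer matrix (with a different Boltzmann weight
w'). Let us define the quantity:


$$ S(\mu, \nu|\mu', \nu'|\alpha, \beta) = \sum_{\gamma} w(\mu, \alpha|\gamma, \mu') \, w'(\nu, \gamma|\beta, \nu') $$   (17.53)


Then the matrix product VV' can be represented as:


$$ (VV')_{\alpha,\beta} = \sum_{\gamma} V_{\alpha,\gamma} V'_{\gamma,\beta} = \sum_{\mu_1 \cdots \mu_N} \sum_{\nu_1 \cdots \nu_N} \prod_{i=1}^{N} S(\mu_i, \nu_i|\mu_{i+1}, \nu_{i+1}|\alpha_i, \beta_i) $$   (17.54)


We can write this in matrix form by introducing the 4 x 4 matrix S(α, β), which
is a (μ, ν) x (μ', ν') matrix whose elements are given by S(μ, ν|μ', ν'|α, β). We
can therefore write:


$$ (VV')_{\alpha,\beta} = \mathrm{Tr}\, S(\alpha_1, \beta_1) S(\alpha_2, \beta_2) \cdots S(\alpha_N, \beta_N) $$

$$ (V'V)_{\alpha,\beta} = \mathrm{Tr}\, S'(\alpha_1, \beta_1) S'(\alpha_2, \beta_2) \cdots S'(\alpha_N, \beta_N) $$   (17.55)

where S' is obtained from S by interchanging w and w'.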

We now assume that V and V' commute, so that the two previous expressions
are identical. This is obviously possible if there exists a 4 x 4 matrix M such that:


$$ S(\alpha, \beta) = M S'(\alpha, \beta) M^{-1} $$   (17.56)

Let the matrix M have elements given by w''(ν, μ|ν', μ'). Let us multiply the
previous relation from the right by M, so we have SM = MS'. Written out
explicitly, this matrix equation is:


$$ \sum_{\gamma, \mu'', \nu''} w(\mu, \alpha|\gamma, \mu'') \, w'(\nu, \gamma|\beta, \nu'') \, w''(\nu'', \mu''|\nu', \mu') = \sum_{\gamma, \mu'', \nu''} w''(\nu, \mu|\nu'', \mu'') \, w'(\mu'', \alpha|\gamma, \mu') \, w(\nu'', \gamma|\beta, \nu') $$   (17.57)


Figure 17.3. The Yang-Baxter relation, shown here pictorially, resembles the topological
relations found in knot theory and braid theory, which demonstrates the close relationship
between exactly solvable statistical systems and topology.
If we redefine: 
wale”, y) = SHH (Cu), — w'(v, y|v", B) = SY (u + 0) 
wv" piv’) = SH) (17.58) 


then we can write the Yang—Baxter relationship in the form: 


DS Si (u) SEP (u + v)SP8(v) = D> SPP (wv) Si2 (uw + v)SZP(u) (17.59) 


a.B.y a,Byy 


If we graphically represent this relationship, then we find the pattern expressed in 
Figure 17.3, which pictorially displays the Yang—Baxter relation. 
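
A relation of this form can be verified numerically for a simple known solution. The sketch below is an illustration under stated assumptions and not the parametrization used in the text: it takes the rational solution R(u) = u·1 + P, where P exchanges two two-valued indices, and checks R_12(u) R_13(u+v) R_23(v) = R_23(v) R_13(u+v) R_12(u) on the 8-dimensional space of three indices. All names are hypothetical.

```python
from itertools import product

BASIS = list(product((0, 1), repeat=3))           # states of three two-valued indices
INDEX = {state: k for k, state in enumerate(BASIS)}
DIM = len(BASIS)


def swap_operator(i, j):
    """8x8 permutation matrix that exchanges slots i and j of each basis state."""
    p = [[0.0] * DIM for _ in range(DIM)]
    for k, state in enumerate(BASIS):
        swapped = list(state)
        swapped[i], swapped[j] = swapped[j], swapped[i]
        p[INDEX[tuple(swapped)]][k] = 1.0
    return p


def r_matrix(u, i, j):
    """Assumed rational solution R_ij(u) = u * identity + P_ij."""
    p = swap_operator(i, j)
    return [[u * (r == c) + p[r][c] for c in range(DIM)] for r in range(DIM)]


def matmul(a, b):
    """Product of two DIM x DIM matrices stored as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(DIM)) for c in range(DIM)]
            for r in range(DIM)]


def yang_baxter_defect(u, v):
    """Largest entry of R12(u) R13(u+v) R23(v) - R23(v) R13(u+v) R12(u)."""
    lhs = matmul(matmul(r_matrix(u, 0, 1), r_matrix(u + v, 0, 2)), r_matrix(v, 1, 2))
    rhs = matmul(matmul(r_matrix(v, 1, 2), r_matrix(u + v, 0, 2)), r_matrix(u, 0, 1))
    return max(abs(lhs[r][c] - rhs[r][c]) for r in range(DIM) for c in range(DIM))


defect = yang_baxter_defect(0.3, 1.7)
```

The middle factor must carry the sum u + v for the identity to hold, which mirrors the way the parameters enter Eq. (17.59); replacing it by an unrelated parameter generally gives a nonzero defect.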

In summary, the reason why many of these two-dimensional models are ex- 
actly solvable is because their transfer matrices commute, and the mathematical 
statement of this fact is the Yang—Baxter relation. The problem of finding exact 
solutions to these two-dimensional models then reduces to finding solutions to 
a much simpler problem, the Yang—Baxter equation. Fortunately, a variety of 
solutions to the Yang—Baxter relation exist. We notice that, as a function of the 
parameters u, v, the matrices appearing in the Yang-Baxter relation have a vague 
resemblance to the addition formulas for sines and cosines. By choosing an ap- 
propriate ansatz, we can, in fact, reduce the Yang—Baxter relations to the usual 
trigonometric addition formulas. (More precisely, it can be shown that a large 
number of solutions to the Yang—Baxter relation can be found using the addition 
formulas of what are called the “modular functions” 7, which are special functions 
found in solutions to certain periodic boundary value problems.) 

Each solution, in turn, corresponds to an exactly solvable statistical mechanics
model. Thus, we have now stumbled upon a powerful way in which to catalog
the known exactly solvable two-dimensional models and also generate new ones.
We will not present the explicit solutions to the Yang-Baxter equation for these
models, since they are technically rather involved, so we refer the interested reader
to the references.

The Yang-Baxter equation, in turn, is intimately related to several other
branches of mathematics, such as knot theory, conformal field theory, and quantum
groups. The topological structure of the Yang-Baxter relation resembles the
manipulation of strands of string. Hence, the Yang—Baxter relation can be reduced 
to the braid group relations found in knot theory. Thus, the relationship between 
knot theory and the Yang—Baxter relation gives us hope that a more or less com- 
plete classification of solutions to the Yang—Baxter relation may eventually be 
found. 


17.4 Mean-Field Approximation 


Unfortunately, the simplifications that exist in one- and two-dimensional sys- 
tems that allow us to find exact solutions do not generalize easily to three and 
four dimensions. The transfer matrix technique, the Yang—Baxter equation, and 
other techniques devised for one- and two-dimensional systems do not have soluble
counterparts for higher dimensional systems. In fact, for years the two-dimensional
Ising model (with zero magnetic field) was the only exactly soluble
two-dimensional system exhibiting a second-order phase transition.

We now must leave the realm of exact solutions and postulate various approx- 
imation schemes, with varying degrees of success. We will study approximation 
schemes that have been proposed over the years, the simplest and most widely 
used being Landau’s mean-field approximation.’ 

The essence of the mean-field approximation is that we can substitute the 
actual field within a substance with an approximate, average field and ignore 
fluctuations. In practice, the mean-field approximation assumes that the magnetic 
field felt inside a substance equals the external magnetic field H plus an average 
field M, which we can calculate by minimizing the action. This assumption, of 
course, totally ignores the local fluctuations of the magnetic field throughout the 
substance, but it serves as a rough first approximation. 

The mean-field approximation assumes that the magnetic field felt inside the
substance is equal to the external magnetic field H, plus the average field M,
plus small corrections:


$$ H' = H + aM - bM^3 + \cdots $$   (17.60)


where we ignore fluctuations. By assumption, the M^2 term is missing and b is
small.

We also know that the average value of M obeys the Curie law, that is, M is
proportional to the magnetic field H' and inversely proportional to the temperature:


$$ M = \frac{cH'}{T} $$   (17.61)


From these two simple assumptions, we can derive a wide variety of nontrivial,
first-order results. Let us now solve for M:


$$ M = \frac{cH'}{T} = \frac{c(H + aM - bM^3)}{T} $$   (17.62)

Let us define T_c to be ac, and then we have:

$$ M(1 - T_c/T + cbM^2/T) = cH/T $$   (17.63)

Now set H = 0 and determine the behavior of M below the critical temperature.
The solution for M becomes:


$$ M = \left( \frac{T_c - T}{cb} \right)^{1/2} \sim (T_c - T)^{1/2} $$   (17.64)


From Eq. (17.14), this implies that β = 1/2. So the first critical exponent has
been determined.

Now let us take the derivative of M with respect to H, and assume that M is
small so we can drop higher powers. Then the susceptibility becomes:


$$ \chi = \frac{\partial M}{\partial H} \sim \frac{c}{T - T_c} \quad \rightarrow \quad \gamma = 1 $$   (17.65)


where we have used Eq. (17.13). So we have now derived the second critical
exponent.

Now set T = T_c, so we are sitting at the critical temperature. In this limit, the
magnetization M becomes very small. The dependence of M on H can again be
calculated from Eq. (17.63), and is therefore given by:


$$ M \sim H^{1/3} $$   (17.66)


From Eq. (17.18), this therefore gives us δ = 3.

In summary, we have calculated three critical exponents with very little effort
by making the key assumption in Eq. (17.60). The mean-field approximation
gives us:


$$ \text{Mean-field theory}: \qquad \beta = 1/2, \quad \gamma = 1, \quad \delta = 3 $$   (17.67)

Thus, the mean-field approximation gives us a wealth of information about a 
statistical system with very little physical input. Although the mean-field approx- 
imation does not give very reliable results for complicated systems, it does much 
better than one might first expect for relatively simple systems. 
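
The three mean-field exponents can also be read off numerically from the self-consistency condition. The sketch below assumes Eq. (17.63) in the form M(T - T_c + cbM²) = cH with T_c = ac, solves it for M by bisection, and estimates each exponent from a log-log slope; the parameter values a = b = c = 1 and the sample temperatures and fields are arbitrary choices for illustration.

```python
import math


def magnetization(T, H, a=1.0, b=1.0, c=1.0):
    """Largest non-negative root M of M*(T - Tc + c*b*M**2) = c*H, with Tc = a*c."""
    Tc = a * c

    def f(M):
        return M * (T - Tc + c * b * M * M) - c * H

    lo = math.sqrt(max(Tc - T, 0.0) / (c * b))   # H = 0 root; f(lo) <= 0 for H >= 0
    hi = lo + 1.0
    while f(hi) <= 0.0:                          # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):                         # plain bisection on [lo, hi]
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def log_slope(x1, y1, x2, y2):
    """Exponent p in y ~ x^p estimated from two points."""
    return math.log(y1 / y2) / math.log(x1 / x2)


Tc = 1.0
small_H = 1e-9


def chi(T):
    """Susceptibility estimated as M/H at a very weak field."""
    return magnetization(T, small_H) / small_H


beta = log_slope(1e-2, magnetization(Tc - 1e-2, 0.0), 1e-4, magnetization(Tc - 1e-4, 0.0))
gamma = -log_slope(1e-2, chi(Tc + 1e-2), 1e-3, chi(Tc + 1e-3))
delta = 1.0 / log_slope(1e-6, magnetization(Tc, 1e-6), 1e-9, magnetization(Tc, 1e-9))
```

The slopes do not depend on the values chosen for a, b, and c, which only set T_c and the overall amplitudes; this is a small instance of exponents being insensitive to model details.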

We would like to justify some of these assumptions within the framework of 
an action and a partition function. We will analyze the mean-field approximation 
within the context of one of the most widely studied statistical systems, Ginzburg— 
Landau theory.’ 

To define the Ginzburg-Landau model in a way that resembles field theory, it
is convenient to introduce the variable σ(x) to represent the value of the spin at
lattice site x. From a field theoretic point of view, the theory then resembles the
φ^4 theory, except it has a linear term proportional to the external magnetic field:


$$ S = \int d^dx \left( r_0 \sigma^2(x) + \frac{u}{4} \sigma^4(x) + c \, [\nabla \sigma(x)]^2 - H \sigma(x) \right) $$   (17.68)


where, by the symbol ∫ d^d x, we mean taking the sum over all lattice sites in the
limit of small lattice spacing, and ∇σ denotes taking differences along the lattice.

The mean-field approximation, in the context of this field theory, becomes the 
expansion of the action around a constant solution to the equations of motion, 
which gives us an average value of the field. In other words, the mean-field theory 
is based on the Born term of a perturbation theory. The mean-field approximation 
corresponds to tree diagrams, and the loop diagrams correspond to the fluctuations 
that we will ignore for the moment. 

To first approximation, we find that the solution to the equations of motion is
given by a constant:


$$ \sigma(x) = \bar\sigma $$   (17.69)


But this also means, from Eq. (17.7), that the magnetization can be given in terms
of the average spin:


$$ M \sim \bar\sigma $$   (17.70)

The gradient term in the action disappears, and then the equations of motion from
Eq. (17.68) become:


$$ 2\bar\sigma \left[ r_0 + (u/2) \bar\sigma^2 \right] - H = 0 $$   (17.71)


In studying the various solutions of this equation, Landau observed that the
point r_0 = 0 marked a qualitative change in the nature of the system for small H,
so that a phase transition was evident. We can therefore make the statement
that r_0 = 0 must be equivalent to T = T_c for this phase transition:


fo = — 7.) (17.72) 


for some constant ¢. [Eq. (17.72) means that there can be spontaneous symmetry 
breaking at the critical temperature, since the sign of the mass term changes.] 
Inserting this back into the original equation, we find: 


2M (7 ee =m") =H (17.73) 


Now compare this with Eq. (17.60) postulated earlier. We find that there is an 
exact correspondence, and hence we can derive the critical exponents precisely in 
the same way as before. 
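
As a brief worked illustration of how the exponents follow, the sketch below assumes that Eq. (17.73) has the Landau form $2M[c(T - T_c) + (u/2)M^2] = H$ with positive constants c and u (an assumption about the constants, not a statement from the text). Under that assumption the three standard limits give:

```latex
\begin{align*}
H = 0,\ T < T_c: &\quad M^2 = \frac{2c}{u}\,(T_c - T)
   \;\Rightarrow\; M \sim (T_c - T)^{1/2}, \quad \beta = \tfrac{1}{2} \\
T = T_c: &\quad H = u M^3
   \;\Rightarrow\; M \sim H^{1/3}, \quad \delta = 3 \\
T > T_c,\ M \to 0: &\quad \chi = \left.\frac{\partial M}{\partial H}\right|_{H=0}
   = \frac{1}{2c\,(T - T_c)}
   \;\Rightarrow\; \gamma = 1
\end{align*}
```

These are the usual mean-field values, and they agree with the entries listed for the Ginzburg-Landau model in Eq. (17.74).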

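A more detailed examination of the Ginzburg-Landau theory in the mean-field
approximation yields the following critical exponents:

$$ \text{Ginzburg-Landau model}: \quad \begin{cases} \alpha = \alpha' = 2 - d/2 \\ \beta = 1/2 \\ \gamma = 1 \\ \delta = 3 \\ \eta = 0 \end{cases} \tag{17.74} $$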


The approximation that we made, that the spin configuration that minimizes 
the action is the constant one, is called the Gaussian approximation, since all 
path integrals to lowest order become Gaussians. It is the particular form that the 
mean-field approximation takes for the Ginzburg-Landau model. 

Historically, when experimental results were not very precise, the mean-field 
approximation was a valuable theoretical tool that gave good explanations of 
the experimental situation. However, as the experimental results became more 
precise over the decades, it became clear that the mean-field approximation gave 
only a rough fit to the data. Attempts to go beyond the mean-field approximation, 
however, were met with frustration. New theoretical ideas were necessary to push
beyond the mean-field approximation. These new ideas came from scaling and
the renormalization group.


17.5 Scaling and the Renormalization Group 


Although the mean-field approximation gives us crude but reasonable fits to the 
data, it is difficult to go beyond the mean-field approximation and derive a per- 
turbation series. Treating the mean-field approximation as the Born term in a 
perturbation series creates new problems near a second-order phase transition. 

In general, at a second-order phase transition the correlation length $\xi$ becomes
infinite. At the transition, spins located in different parts of the system have a large
effect on each other. This also means that specific features of the model wash out
at the phase transition, giving us universality. Since $\xi$ sets the basic scale of the
system, at criticality the system usually loses all dependence on length; that is, it
becomes scale or conformally invariant.

This means that the behavior of the magnetization, susceptibility, etc. near
the transition can be determined by the behavior of $\xi$. But since $\xi \sim (T - T_c)^{-\nu}$,
this means that all critical exponents can be written in terms of more fundamental
critical exponents, like $\nu$.

For perturbation theory, however, this causes problems. In d dimensions, the
coupling constant g has dimensions. Therefore perturbation theory can be based
on the dimensionless quantity:

$$ g\,\xi^{4-d} \tag{17.75} $$

However, near a phase transition, we have $\xi \to \infty$, so this clearly diverges
if d < 4. The coupling constant becomes infinitely strong and the perturbation
theory makes no sense. For d > 4, the crucial features of the phase transition
often disappear, and the approximation becomes useless. For many years, this
prevented a perturbative generalization of the mean-field theory near the critical
point.

However, it is possible to set up a new perturbation theory that is defined near
the phase transition using the renormalization group. The new perturbation theory
will be defined in

$$ d = 4 - \epsilon \tag{17.76} $$

dimensions. (For example, for three-dimensional systems, $\epsilon = 1$.)


To understand how the $\epsilon$ expansion cures the usual problems of ordinary
perturbation theory, let us first use a few scaling arguments to derive the relationship
between critical exponents. The free energy F has dimension equal to zero, since
it is not affected by a scale change. The free energy per unit volume therefore has
dimension d. Therefore, by the scaling hypothesis and Eq. (17.16):

$$ F/V \sim \xi^{-d} \sim |T - T_c|^{\nu d} \tag{17.77} $$

From the free energy, we can calculate other physical quantities and their
exponents. If we calculate the specific heat C, we find:

$$ C \sim \frac{\partial^2 F}{\partial T^2} \sim |T - T_c|^{\nu d - 2} \tag{17.78} $$

It therefore follows from Eq. (17.15) that:

$$ \alpha = 2 - \nu d \tag{17.79} $$

From Eq. (17.17), the correlator of two spins has dimension $d - 2 + \eta$. The
dimension of spin is therefore half that amount. From Eq. (17.7), we see that the
magnetization has dimension $d_M = (d - 2 + \eta)/2$. Therefore we can read off its
critical exponent:

$$ M \sim \xi^{-d_M} \sim |T - T_c|^{\nu(d - 2 + \eta)/2} \tag{17.80} $$

Therefore, from Eq. (17.13), we have:

$$ \beta = \tfrac{1}{2}\nu(d - 2 + \eta) \tag{17.81} $$

We also know that the external field H, because $M = -\partial F/\partial H$, must have
dimension equal to:

$$ d_H = d - d_M = (d + 2 - \eta)/2 \tag{17.82} $$

Therefore, we also have:

$$ M \sim \left( H^{1/d_H} \right)^{d_M} \tag{17.83} $$

in order to make the dimensions match. From Eq. (17.18), this gives us:

$$ \delta = d_H/d_M = (d + 2 - \eta)/(d - 2 + \eta) \tag{17.84} $$

In summary, a few simple assumptions about the scaling behavior at criticality
give us nontrivial relationships between the various critical exponents:

$$ \text{Scaling}: \quad \begin{cases} \alpha = \alpha' = 2 - \nu d \\ \beta = \tfrac{1}{2}\nu(d - 2 + \eta) \\ \delta = (d + 2 - \eta)/(d - 2 + \eta) \\ \nu = \nu' = \gamma/(2 - \eta) \end{cases} \tag{17.85} $$
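
The scaling relations summarized in Eq. (17.85) are simple enough to evaluate mechanically. The following is a minimal sketch, not part of the original derivation: the function name `scaling_exponents` is hypothetical, and the two-dimensional Ising inputs ($\nu = 1$, $\eta = 1/4$) are standard exact values supplied here as an assumption for illustration. It uses exact rational arithmetic so the results can be compared without rounding.

```python
from fractions import Fraction


def scaling_exponents(nu, eta, d):
    """Evaluate the scaling relations for given nu, eta and dimension d.

    gamma is obtained by rearranging nu = gamma / (2 - eta).
    """
    nu, eta, d = Fraction(nu), Fraction(eta), Fraction(d)
    return {
        "alpha": 2 - nu * d,
        "beta": nu * (d - 2 + eta) / 2,
        "gamma": nu * (2 - eta),
        "delta": (d + 2 - eta) / (d - 2 + eta),
    }


# Mean-field inputs in d = 4 (nu = 1/2, eta = 0).
mean_field = scaling_exponents(Fraction(1, 2), 0, 4)

# Two-dimensional Ising inputs (assumed known: nu = 1, eta = 1/4).
ising_2d = scaling_exponents(1, Fraction(1, 4), 2)

print(mean_field)
print(ising_2d)
```

With the mean-field inputs in four dimensions the relations return the Landau values, which is one way to see that d = 4 is the dimension at which scaling and mean-field theory agree.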


To go beyond these simple-minded arguments, we will now use the method of
block spins in order to calculate, to lowest order in $\epsilon$, the critical exponents.

The block spin method of Kadanoff$^{9,10}$ is based on some rather intuitive
arguments. We know that, at criticality, the correlation length goes to infinity and
many of the features of the model get washed out. At the phase transition, the
partition function obeys a new symmetry. At first, this seems strange, since the
lattice spacing between spins is equal to a, which is not scale invariant. However,
at criticality, the system loses its dependence on a length scale and obeys highly
nontrivial scaling properties. This allows us to write down the renormalization
group equations for the system, using a prescription slightly different from that
used in the previous chapters.

Let us begin with a partition function Z(a) defined on a hypercubical lattice
of length L with spacing a. At each lattice site, we have a spin operator $\sigma_i$, where
$i = 1, \ldots, n$. Now let us decompose this lattice into larger blocks of length b,
which is a multiple of a, so b = sa. We now perform the spin averaging within
each larger block. This averaging within each larger block creates a new average
spin $\sigma'_i$ with a new Hamiltonian. This will create a new partition function Z'(b)
that is defined on a new lattice with lattice spacing b, such that the spin operator
at the various blocks is defined to be the average spin $\sigma'_i$.

Now rescale the new partition function Z'(b) by simply reducing the lattice
spacing from b to a. In general, the partition function that we get, Z'(a), is not equal
to the original Z(a) with which we started. This is because the two operations
we have performed are quite distinct. The first operation integrated out the spins
within each block to define a new lattice of length b, while the second operation
was a trivial rescaling from b back to a. However, at criticality, when we lose all
reference to mass scales, these two operations should be roughly inverses of each
other.
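
The averaging step can be pictured concretely. The sketch below is only an illustration under simplifying assumptions: a two-dimensional square lattice stored as a list of lists, single-component spins, and a hypothetical helper `block_average`. It replaces each s-by-s block by its mean spin, which is the first of the two operations described above; it does not compute the new Hamiltonian.

```python
def block_average(spins, s):
    """Replace each s-by-s block of a square lattice by its average spin."""
    size = len(spins)
    if size % s != 0:
        raise ValueError("block length must divide the lattice length")
    blocks = size // s
    coarse = []
    for bi in range(blocks):
        row = []
        for bj in range(blocks):
            # Sum the s*s spins belonging to block (bi, bj).
            total = sum(
                spins[bi * s + i][bj * s + j]
                for i in range(s)
                for j in range(s)
            )
            row.append(total / (s * s))
        coarse.append(row)
    return coarse


spins = [
    [1, 1, -1, -1],
    [1, 1, -1, -1],
    [1, -1, 1, 1],
    [-1, 1, 1, 1],
]
coarse = block_average(spins, 2)
print(coarse)
```

The coarse lattice has spacing b = sa; relabelling its sites with spacing a is the trivial rescaling that makes up the second operation.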

Let $K_s$ represent an operator that performs the first operation of averaging
within each block of size b. The operator $K_s$ transforms the original Hamiltonian
$H(\sigma)$ into a new Hamiltonian $H'(\sigma')$ by averaging over the $s^d$ spins in each block
(we set k = 1):

$$ K_s\left( H(\sigma)/T \right) = H'(\sigma')/T \tag{17.86} $$

More precisely, the action of the operator $K_s$ is given by:

$$ e^{-H'(\sigma')/T} = \int e^{-H(\sigma)/T} \prod_{i,x} \delta\Big( \sigma'_{i,x} - s^{-d}\sum_{y}\sigma_{i,y} \Big) \prod_{j,y} d\sigma_{j,y} \tag{17.87} $$

where $s^{-d}\sum_y \sigma_{i,y}$ represents the average spin over the $s^d$ old blocks, replaced by
the new block centered at x. This operation averages over all spins $\sigma_{i,y}$ within
each block and replaces them with a function of spin $\sigma'_{i,x}$, defined at the center of
each block.

We will find it more convenient to work in momentum space. The spin $\sigma_{i,k}$,
where k represents the momentum, can be represented as:

$$ \sigma_{i,k} = L^{-d/2}\sum_{c} e^{-ik\cdot c}\,\sigma_{i,c} \tag{17.88} $$

$$ \sigma_{i,c} = L^{-d/2}\sum_{k<\Lambda} e^{ik\cdot c}\,\sigma_{i,k} \tag{17.89} $$

where c is the site of the spin $\sigma_{i,c}$. The momentum sum only extends as far as a
sphere of radius $\Lambda$. (Beyond that momentum, we are probing a distance less than
the lattice spacing, which is undefined.)

In momentum space, the $K_s$ operation can be written as:

$$ e^{-H'/T} = \int e^{-H/T} \prod_{i,\ \Lambda>k>\Lambda/s} d\sigma_{i,k} \tag{17.90} $$

We have split the momentum sum over k into two parts. The sum over
$\Lambda/s < k < \Lambda$ corresponds, in x space, to the sum over the spins with block size
less than b, but larger than a. The sum over $k < \Lambda/s$, which corresponds to taking
sums over blocks larger than b, is omitted in the block spin method.

Next, we rescale the size of the lattice from b back to a, which means we also
must rescale the following:

$$ \sigma(x) \to s^{1 - d/2 - \eta/2}\,\sigma(x), \qquad x \to x/s, \qquad \int d^dx \to s^d\int d^dx \tag{17.91} $$

where s is the scaling parameter.

The combination of the two operations, $K_s$ and the rescaling shown, gives us
the renormalization group operator $R_s$, which acts on the physical parameters of
the original theory, called collectively $\mu$, and creates a new set $\mu'$:

$$ \mu' = R_s\mu \tag{17.92} $$

where:

$$ R_s R_{s'} = R_{ss'} \tag{17.93} $$


Now we apply the method of block spins to the Ginzburg-Landau action:

$$ H(r_0, u, c) = H_0 + H_1, \qquad H_0 = \frac{1}{2}\int d^dx \left[ r_0\sigma^2 + c(\nabla\sigma)^2 \right], \qquad H_1 = \frac{u}{8}\int d^dx\, \sigma^4 \tag{17.94} $$

so that the set of parameters is $\mu = \{r_0, u, c\}$.
The calculation is conceptually simple, but the details are a bit involved, so 
we will break it up into four steps: 


1. First, we perform the block spin integration, which converts H into H':

$$ H(r_0, u, c; \sigma) \to H'(r'_0, u', c'; \sigma') \tag{17.95} $$

where H' has the same form as H, except that it is defined with parameters
$r'_0$, $u'$, $c'$, and spins $\sigma'$.

2. Then we go to the critical point, where u is stationary, i.e. u' = u. The
solution of this gives us the critical value $u^*$.

3. Near criticality, we solve for $r'_0$. This gives us the behavior of $r'_0$ near
criticality:

$$ (r'_0 - r_0^*) = s^{1/\nu}(r_0 - r_0^*) + \cdots \tag{17.96} $$

where the ellipsis includes terms of higher order in $\epsilon$. This gives us the value
of $\nu$.

4. Last, we will insert this value of $\nu$ into the scaling relations in Eq. (17.85),
which gives us the remaining critical exponents.


17.5.1 Step One


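To begin, we will perform the sum by splitting up $\sigma_i$ into two pieces:

$$ \sigma_i = \sigma'_i + \hat{\sigma}_i, \qquad \sigma'_i = L^{-d/2}\sum_{k<\Lambda/s}\sigma_{i,k}\,e^{ik\cdot x}, \qquad \hat{\sigma}_i = L^{-d/2}\sum_{\Lambda/s<q<\Lambda}\sigma_{i,q}\,e^{iq\cdot x} \tag{17.97} $$

where we have split the sum over momentum space into two parts. The purpose of
this split is that the summation over the blocks with less than size b is performed
by summing over the spins within $\Lambda/s < q < \Lambda$. Thus, we are only interested in
the summation over $\hat{\sigma}_i$, while keeping $\sigma'_i$ constant.

After performing the averaging over $\hat{\sigma}_i$, we are left with a new Hamiltonian
defined totally with the variable $\sigma'_i$. This new Hamiltonian $H'(\sigma'_i)$ will have
parameters $r'_0$, $u'$, and $c'$ that we want to calculate.

We will only perform the calculation to lowest order, so we will power expand
in $H_1$. After performing the block spin summation over $\hat{\sigma}$, H changes into H',
where:

$$ H' = R_s H = \frac{1}{2}\int d^dx \left( r_0\sigma'^2 + c(\nabla\sigma')^2 \right) + \langle H_1\rangle - \frac{1}{2}\left\langle \left( H_1 - \langle H_1\rangle \right)^2 \right\rangle + \cdots = H'_0 + A + B, \qquad \sigma_k \to s^{1-\eta/2}\sigma_{sk} \tag{17.98} $$

Our problem, therefore, is to average over $\hat{\sigma}$, which leaves us with a modified
Hamiltonian H' defined in terms of $\sigma'$, which in turn allows us to compute $r'_0$ and
$u'$.

The key to the calculation is therefore to compute A, B:

$$ A = \langle H_1\rangle = \frac{u}{8}\int d^dx\, \langle\sigma^4\rangle, \qquad B = -\frac{u^2}{128}\int d^dx\, d^dy\, \left\langle \left[ \sigma^4(x) - \langle\sigma^4(x)\rangle \right]\left[ \sigma^4(y) - \langle\sigma^4(y)\rangle \right] \right\rangle \tag{17.99} $$

A and B, after summing over $\hat{\sigma}$, are functions of $(\sigma')^2$ and $(\sigma')^4$, which give
corrections to $r_0$ and u.

To perform the averaging over the block, we will use the following relations:

$$ \langle\hat{\sigma}^m\rangle = 0 \ (m \text{ odd}), \qquad \langle\hat{\sigma}_i\hat{\sigma}_j\rangle = \frac{\delta_{ij}}{n}\langle\hat{\sigma}^2\rangle \tag{17.100} $$

The summation is over a Gaussian, and is hence calculable. From the Hamiltonian
in Eq. (17.94), we know that the two-point function is given by the usual
propagator, which is $1/(r_0 + cq^2)$. Then the summation over the block spins from
$\Lambda/s < q < \Lambda$ can be performed by taking the continuum limit:

$$ \langle\hat{\sigma}^2\rangle = L^{-d}\sum_{i,\ \Lambda/s<q<\Lambda}\langle\hat{\sigma}_{i,q}\hat{\sigma}_{i,-q}\rangle = n(2\pi)^{-d}\int d^dq\, \frac{1}{r_0 + cq^2} = nK_d\int_{\Lambda/s}^{\Lambda} dq\, \frac{q^{d-1}}{r_0 + cq^2} = n_c\left( 1 - s^{2-d} \right) - nK_4c^{-2}r_0\log s \tag{17.101} $$

where $d^dq = q^{d-1}dq\,K_d(2\pi)^d$ and:

$$ n_c = (n/c)K_d\Lambda^{d-2}/(d - 2), \qquad K_d = 2^{-d+1}\pi^{-d/2}/\Gamma(d/2) \tag{17.102} $$

where $K_d$ is the surface area of a unit d dimensional sphere, divided by $(2\pi)^d$,
and $d = 4 - \epsilon$.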


The point of listing these identities is to find an expression for $\langle\sigma^4\rangle$, which
can now be written as:

$$ \langle\sigma^4\rangle = \left\langle \left[ \sigma'^2 + 2\sigma'\cdot\hat{\sigma} + \hat{\sigma}^2 \right]^2 \right\rangle = (\sigma')^4 + 2(\sigma')^2\langle\hat{\sigma}^2\rangle + 4\langle(\hat{\sigma}\cdot\sigma')^2\rangle + \langle\hat{\sigma}^4\rangle \tag{17.103} $$

Only the second and third term give a contribution to $(\sigma')^2$.
The third term can be written as:

$$ \langle(\hat{\sigma}\cdot\sigma')^2\rangle = \sigma'^2\left[ (n_c/n)\left( 1 - s^{2-d} \right) - K_4c^{-2}r_0\log s \right] \tag{17.104} $$

Now insert Eqs. (17.101) and (17.104) into (17.103) and collect all terms containing
$(\sigma')^2$. Inserting these averages back into the expression for A in Eqs. (17.99)
and (17.98), we find:

$$ r'_0 = s^2\Big[ r_0 + (u/c)(n/2 + 1)(\Lambda^2/2)K_4\left( 1 - s^{-2} \right) - r_0(u/c^2)(n/2 + 1)K_4\log s + \cdots \Big] \tag{17.105} $$


(where the C and D terms are not important to the final result). 
Next, we wish to calculate the term B in Eq. (17.99). This is also straightforward.
We define:

$$ \delta_{ij}G(x - y) = \langle\hat{\sigma}_i(x)\hat{\sigma}_j(y)\rangle \tag{17.106} $$

We therefore have:

$$ G(r) = L^{-d}\sum_{\Lambda/s<q<\Lambda} e^{iq\cdot r}\,(cq^2)^{-1} = (2\pi)^{-2}r^{-2}c^{-1}\left[ J_0(\Lambda r/s) - J_0(\Lambda r) \right] \tag{17.107} $$

where $J_0$ is a Bessel function, and:

$$ \int d^dr\, G^2(r) = K_4c^{-2}\int_{\Lambda/s}^{\Lambda}\frac{dq}{q} = K_4c^{-2}\log s \tag{17.108} $$

If we expand the terms in B in Eq. (17.99), we find a large number of
extraneous terms. After summing over $\hat{\sigma}$, the only terms that survive are of the
form $(\sigma')^4G^2$ and $(\sigma')^2G^2$. All summations over the $\hat{\sigma}$ can therefore be performed
over the block spin. When the summation is performed, we find that we have a
new Hamiltonian H', which has the same form as the original Ginzburg-Landau
Hamiltonian, but is now a function of the spin operator $\sigma'_i$, with coefficients
$r'_0$, $u'$, and $c'$.

Inserting these summations back into Eq. (17.99), we find that the $(\sigma')^4G^2$
contribution to B gives a correction to u:

$$ u' = s^{\epsilon}\left[ u - (u^2/2c^2)(n + 8)K_4\log s \right] \tag{17.109} $$

Now that we have explicit results for $r'_0$ and $u'$, this completes the first step of our
calculation.


17.5.2 Step Two


To complete the second step, we go to criticality, where we have $u = u' = u^*$,
since u is invariant under the scale transformation. Imposing this restriction, we
can now solve for $u^*$ from the previous equation:

$$ u^* = \frac{2\epsilon c^2}{(n + 8)K_4} \tag{17.110} $$

We can also solve for the value of $r_0^*$ at criticality by inserting Eq. (17.110) into
Eq. (17.105) and ignoring higher-order terms:

$$ r_0^* = -(u^*/c)(n/2 + 1)(\Lambda^2/2)K_4 = -\frac{n+2}{n+8}\,\epsilon\,(\Lambda^2c/2) \tag{17.111} $$

where we have inserted the value of $u^*$.
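
The approach to the fixed point can also be seen numerically. The sketch below is illustrative only: it assumes the recursion relation for u has the form $u' = s^{\epsilon}[u - (u^2/2c^2)(n + 8)K_4\log s]$, takes $K_4 = 1/8\pi^2$, and uses hypothetical function names and arbitrary sample values ($\epsilon = 0.1$, n = 1, s = 2, c = 1). Iterating the map at finite s converges to a value that agrees with the lowest-order $u^* = 2\epsilon c^2/(n + 8)K_4$ only up to corrections of relative order $\epsilon\log s$.

```python
import math

K4 = 1.0 / (8.0 * math.pi**2)  # K_d evaluated at d = 4


def recursion_u(u, s, eps, n, c=1.0):
    """One renormalization-group step for the quartic coupling (assumed form)."""
    return s**eps * (u - (u * u / (2.0 * c * c)) * (n + 8) * K4 * math.log(s))


def iterate_u(u0, s, eps, n, c=1.0, steps=2000):
    """Apply the recursion repeatedly, starting from u0."""
    u = u0
    for _ in range(steps):
        u = recursion_u(u, s, eps, n, c)
    return u


def u_star_lowest_order(eps, n, c=1.0):
    """Fixed point to lowest order in epsilon."""
    return 2.0 * eps * c * c / ((n + 8) * K4)


eps, n, s = 0.1, 1, 2.0
u_iter = iterate_u(0.5, s, eps, n)
u_lowest = u_star_lowest_order(eps, n)
print(u_iter, u_lowest)
```

The iteration is insensitive to the starting value over a wide range, which is the numerical counterpart of the statement that u is driven to a fixed value at criticality.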


17.5.3 Step Three 


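The third step consists of calculating the value of $r'_0$ as a function of s. Inserting
the value of $u^*$ and $r_0^*$ into $r'_0$, we have:

$$ (r'_0 - r_0^*) = s^{1/\nu}(r_0 - r_0^*) \tag{17.112} $$

where:

$$ \frac{1}{\nu} = 2 - \frac{n+2}{n+8}\,\epsilon + O(\epsilon^2) \tag{17.113} $$

This fixes the value of $\nu$ to be:

$$ \nu = \frac{1}{2} + \frac{1}{4}\,\frac{n+2}{n+8}\,\epsilon + O(\epsilon^2) \tag{17.114} $$


17.5.4 Step Four


Finally, we now insert this value of $\nu$ into the scaling relations in Eq. (17.85) to
obtain the rest of the critical exponents. To this order, we have $\eta = 0$; therefore:

$$ \text{Block spins}: \quad \begin{cases} \alpha = (4 - n)\epsilon/2(n + 8) \\ \beta = \tfrac{1}{2} - 3\epsilon/2(n + 8) \\ \gamma = 1 + (n + 2)\epsilon/2(n + 8) \\ \delta = 3 + \epsilon \\ \eta = 0 \\ \nu = \tfrac{1}{2} + (n + 2)\epsilon/4(n + 8) \end{cases} \tag{17.115} $$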
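
As a consistency check on Step Four, the lowest-order exponents can be tabulated and compared with the scaling relation $\alpha = 2 - \nu d$ at $d = 4 - \epsilon$. This is a minimal sketch with hypothetical function names; it assumes the first-order expressions read as in Eq. (17.115), with $\alpha = (4 - n)\epsilon/2(n + 8)$ and $\nu = 1/2 + (n + 2)\epsilon/4(n + 8)$. The two expressions for $\alpha$ should differ only at order $\epsilon^2$.

```python
from fractions import Fraction


def first_order_exponents(n, eps):
    """Critical exponents to first order in epsilon (assumed form of Eq. 17.115)."""
    n, eps = Fraction(n), Fraction(eps)
    q = n + 8
    return {
        "alpha": (4 - n) * eps / (2 * q),
        "beta": Fraction(1, 2) - 3 * eps / (2 * q),
        "gamma": 1 + (n + 2) * eps / (2 * q),
        "delta": 3 + eps,
        "eta": Fraction(0),
        "nu": Fraction(1, 2) + (n + 2) * eps / (4 * q),
    }


def alpha_from_scaling(nu, eps):
    """alpha = 2 - nu * d evaluated at d = 4 - epsilon."""
    return 2 - nu * (4 - Fraction(eps))


n, eps = 1, Fraction(1, 100)
exps = first_order_exponents(n, eps)
residual = alpha_from_scaling(exps["nu"], eps) - exps["alpha"]
print(exps)
print(residual)
```

The residual is exactly $(n + 2)\epsilon^2/4(n + 8)$, so the tabulated $\alpha$ and the scaling relation agree to the order at which the calculation was performed.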


17.6 $\epsilon$ Expansion


We have seen the importance of the emergence of a new symmetry, scale invariance,
at criticality because the system loses all reference to a length scale at the
phase transition. This also means, however, that we can use an alternative method
of deriving these identities, equivalent to the method of block spins, which is the
familiar Callan-Symanzik equations. The usual Callan-Symanzik relations allow
us to calculate the critical exponents to arbitrary order in $\epsilon$ by calculating loop
diagrams.$^{11}$

In familiar field theory language, we start with the action:

$$ S = \int d^dx \left\{ \frac{Z_3}{2}\left[ (\partial_\mu\phi^a)^2 + m^2\phi^a\phi^a \right] + \frac{gZ_1}{4!}\left[ \phi^a\phi^a \right]^2 - \frac{Z_3}{2}\,\delta m^2\,\phi^a\phi^a \right\} \tag{17.116} $$

where we sum over a = 1, ..., N and where $Z_1$ and $Z_3$ are the usual renormalization
constants that correspond to the four-point and two-point functions. Notice
that we are taking a Euclidean metric, not a Minkowski metric.

Then we can immediately write down the Callan-Symanzik equations for the
s-point function:

$$ \left( m\frac{\partial}{\partial m} + \beta(g)\frac{\partial}{\partial g} - \frac{s}{2}\gamma_3(g) \right)\Gamma^{(s)} = \Delta\Gamma^{(s)} \tag{17.117} $$

where we will omit the right-hand side in the asymptotic limit that we are analyzing.

In this asymptotic limit, we can solve the Callan-Symanzik equations in the
usual way, and we get:

$$ \Gamma^{(s)}(\lambda p_i; m, g) = \lambda^{d - s(d-2)/2}\exp\left[ -\frac{s}{2}\int_g^{g(\lambda)} dg'\,\frac{\gamma_3(g')}{\beta(g')} \right]\Gamma^{(s)}(p_i; m, g(\lambda)) \tag{17.118} $$

where $g(\lambda)$ is defined by:

$$ \int_g^{g(\lambda)}\frac{dg'}{\beta(g')} = \log\lambda \tag{17.119} $$


However, as we mentioned earlier, this solution is only formal, since the
perturbation theory in g near the critical point does not make any sense: the
dimensionless quantity in which we expand is $g\xi^{4-d}$, which blows up for d < 4
near criticality. This is the reason why, before renormalization group arguments
were developed, the mean-field approximation could not be generalized properly
near a phase transition.

Thus, we want a revised set of equations defined as a perturbation series in a
new, dimensionless quantity called u:

$$ u = g\,m^{d-4} = g\,m^{-\epsilon} \tag{17.120} $$

In terms of the new dimensionless variable u, the Callan-Symanzik equations
are almost identical, except that the independent variable is now given by u. We
therefore have:

$$ \beta(u) = m\frac{\partial u}{\partial m}\Big|_{g_0}, \qquad \gamma_3(u) = m\frac{\partial\log Z_3}{\partial m}\Big|_{g_0} \tag{17.121} $$

and the bare coupling constant $g_0$ is related to the renormalized one by:

$$ g_0 = g\,\frac{Z_1(u)}{Z_3^2(u)} \tag{17.122} $$

In terms of $Z_i$, we can write:

$$ \beta(u) = -\epsilon\left[ \frac{d}{du}\log\left( \frac{uZ_1(u)}{Z_3^2(u)} \right) \right]^{-1}, \qquad \gamma_3(u) = \beta(u)\frac{d\log Z_3(u)}{du} \tag{17.123} $$

To make contact with the usual physical variables, we define the standard
renormalized two-point function as the mass squared, and four-point function as
the coupling constant at the point $p^2 = 0$:

$$ \Gamma^{(2)}(p, -p; m, u)\big|_{p^2=0} = m^2, \qquad \frac{\partial}{\partial p^2}\Gamma^{(2)}(p, -p; m, u)\Big|_{p^2=0} = 1, \qquad \Gamma^{(4)}(0, 0, 0; m, u) = g \tag{17.124} $$

where all vertices are defined as a function of u.

In terms of this new variable, we can extract the scaling property from the
solution to the Callan-Symanzik equations. We are interested in rescaling the
momenta $p_i \to \lambda p_i$ in the asymptotic limit $\lambda \to \infty$. From simple dimensional
arguments, we know that naive scaling gives:

$$ \Gamma^{(s)}(\lambda p_i; m, u) = \lambda^{d - s(d-2)/2}\,\Gamma^{(s)}(p_i; m/\lambda, u) \tag{17.125} $$

So far, everything resembles the ordinary field theoretic discussion. To begin
the calculation, let us assume that we are near a fixed point, such that:

$$ \beta(u^*) = 0 \tag{17.126} $$

Then the $\gamma_3$ term in Eqs. (17.117) and (17.118) contributes to the asymptotic limit
near $u^*$, such that:

$$ \Gamma^{(s)}(\lambda p_i; m, u^*) \sim \lambda^{[d - s(d-2)/2 - s\gamma_3(u^*)/2]} \tag{17.127} $$

However, we know from the definition of the critical exponents that the two-point
function scales as:

$$ \Gamma^{(2)}(\lambda p) \sim \lambda^{d - 2d_\phi} \tag{17.128} $$

where $2d_\phi$ is the anomalous dimension of $\phi$. From the chart Eq. (17.19), we
can compare this with the two-point correlation function's asymptotic behavior
at criticality. Equating Eqs. (17.127) and (17.128) and setting s = 2, we have
$2d_\phi = d - 2 + \eta$, and thus:

$$ \eta = \gamma_3(u^*) \tag{17.129} $$


Our strategy to calculate the critical exponents is now as follows.

1. First, we calculate the two- and four-point Feynman graphs necessary to
evaluate $Z_1$ and $Z_3$.

2. We insert these values of $Z_1$ and $Z_3$ into Eq. (17.123) and calculate $\beta(u)$ and
$\gamma_3(u)$.

3. We calculate the critical point $u^*$ by setting $\beta(u^*) = 0$.

4. Finally, we insert $u^*$ into $\gamma_3$, and use the relation $\gamma_3(u^*) = \eta$ to calculate the
critical exponent. By similar arguments, we can also calculate $\gamma$. With $\eta$ and
$\gamma$, we can calculate all the critical exponents via the scaling relations in Eq.
(17.85).


The Feynman rules corresponding to our action are easy to calculate. The
propagator, for example, is just the usual $1/(q^2 + 1)$. A direct calculation of the
renormalization constants for the two- and four-point functions yields:

$$ Z_1 = 1 + \frac{n+8}{6}\,a\,(Su) + (Su)^2\left( \frac{n^2 + 26n + 108}{36}\,a^2 - \frac{5n+22}{9}\,c \right) + O(u^3) $$

$$ Z_3 = 1 + \frac{n+2}{18}\,(Su)^2\,b + \frac{(n+8)(n+2)}{54}\,(Su)^3\,(ab - d/2) + O(u^4) \tag{17.130} $$

where $S = 2\pi^{d/2}/\Gamma(d/2)(2\pi)^d$ and where values of the loop parameters a, b, c, d
are given by Feynman's rules:


eas sar? @ qn 
b = 
eae p?=0 
. | d%q, d4qp 
qi +D@i+D[(p+an+Q +1] 
ee | _aigndtg, 
(22s (q? + 1)°(q@3 + 1) [Ca +4q2)? + 1] 
sl 1 d 
~— $3(2)34 dp? | p20 


2

These integrals can all be performed using dimensional regularization. We find:


a = al —~Ae2).08) 


sees 
1 4€ 


b= — eo 8 Oe) 
oS 57 [1 -€/2+ 0(€)] 
1 1 
d = ~ga | je 01] - 1/046 (17.132) 
where: 
1 log x(1 — x) 
pe aa ee ey ee A 17.133 
i Sa 


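We can now calculate $\beta$ and $\gamma_3$ from the definitions of the renormalization-group
variables in Eq. (17.123):

$$ \beta(u) = -u\left[ \epsilon - \frac{n+8}{6}(1 - \epsilon/2)(Su) + \frac{3n+14}{12}(Su)^2 \right] + O(u^4) $$

$$ \gamma_3(u) = -\epsilon\,\frac{n+2}{9}\,(Su)^2\left( b + (Su)\,\frac{n+8}{6}\,(2ab - 3d/2) \right) + O(u^4) \tag{17.134} $$

Solving $\beta(u^*) = 0$ at criticality, we find:

$$ u^* = \frac{6\epsilon}{S(n+8)}\left[ 1 + \epsilon\left( \frac{1}{2} + \frac{3(3n+14)}{(n+8)^2} \right) \right] + O(\epsilon^3) \tag{17.135} $$

If we now plug the value of $u^*(\epsilon)$ into $\gamma_3(u)$ in Eq. (17.123), we then have a
power expansion of $\gamma_3 = \eta$ in terms of $\epsilon$, as desired.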
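
The step of solving $\beta(u^*) = 0$ can be checked numerically. The sketch below is a hedged illustration, not the text's own procedure: it assumes the two-loop form $\beta = -v[\epsilon - \frac{n+8}{6}(1 - \epsilon/2)v + \frac{3n+14}{12}v^2]$ with $v = Su$, truncates it at that order, and compares the nontrivial root nearest the origin with the series $v^* = \frac{6\epsilon}{n+8}[1 + \epsilon(\frac{1}{2} + \frac{3(3n+14)}{(n+8)^2})]$. Function names are hypothetical.

```python
import math


def beta_two_loop(v, eps, n):
    """Truncated beta function in the variable v = S*u (assumed form)."""
    a1 = (n + 8) / 6.0 * (1.0 - eps / 2.0)
    a2 = (3 * n + 14) / 12.0
    return -v * (eps - a1 * v + a2 * v * v)


def fixed_point_numeric(eps, n):
    """Smallest positive root of eps - a1*v + a2*v**2 = 0."""
    a1 = (n + 8) / 6.0 * (1.0 - eps / 2.0)
    a2 = (3 * n + 14) / 12.0
    disc = a1 * a1 - 4.0 * a2 * eps
    # Numerically stable form of (a1 - sqrt(disc)) / (2 * a2).
    return 2.0 * eps / (a1 + math.sqrt(disc))


def fixed_point_series(eps, n):
    """Fixed point through second order in epsilon."""
    return 6.0 * eps / (n + 8) * (
        1.0 + eps * (0.5 + 3.0 * (3 * n + 14) / (n + 8) ** 2)
    )


n = 1
for eps in (0.01, 0.1):
    print(eps, fixed_point_numeric(eps, n), fixed_point_series(eps, n))
```

The two values differ only at order $\epsilon^3$, which is the accuracy claimed for the series solution.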

In addition to $\eta$, we must also evaluate one more critical exponent. In order to
determine all the critical exponents, we would also like to solve for the exponent
$\gamma$, which determines the critical behavior of the magnetic susceptibility. The
susceptibility is defined as the second derivative of the free energy, and hence
we want to calculate the anomalous dimension of the composite operator $\phi^2(x)$.
Although only $Z_1$ and $Z_3$ are sufficient to renormalize the original action in
the usual way, we will find it convenient to introduce another renormalization
constant $Z_4$, which is the renormalization constant that appears in the vertex of
$\phi^2(x)$ coupled to two other fields:

$$ \langle\phi^2(x)\phi(y)\phi(0)\rangle = Z_4(u)\,\langle\phi^2(x)\phi(y)\phi(0)\rangle_{\rm bare} \tag{17.136} $$

We will therefore find it convenient to introduce the vertex function $\Gamma^{(1,s)}$:
Tq, Pist**Psim,u) = peste naan Gxpeedx; 
x (P(x) (x1) + @C4,))| (17.137) 


As before, we can show that this new vertex function also satisfies a Callan— 
Symanzik relation: 


(ms. + BW = (3/2 = 1ys@)— yas) rea = (17 88) 


where:

$$ \gamma_4(u) = m\frac{\partial}{\partial m}\log Z_4\Big|_{g_0} = \beta(u)\frac{d\log Z_4}{du} \tag{17.139} $$

We can now extract the asymptotic behavior of this vertex function as we
rescale the momenta $p_i \to \lambda p_i$:


TO (ag; Api) © AEE) le") (17.140) 

However, from general asymptotic arguments, we also know that the asymptotic
behavior of the vertex function is governed by the anomalous dimensions $d_\phi$ and
$d_{\phi^2}$:

Cy Vs (17.141) 
So we obtain from Eqs. (17.140) and (17.141), setting s = 2:

$$ d_{\phi^2} = 2d_\phi - \gamma_4(u^*) \tag{17.142} $$

From this and the definition of $\gamma$ in Eq. (17.19), it can be shown that$^{11}$:


2 
ay 


(17.143) 


This is the desired relationship between the critical exponent $\gamma$ and the anomalous
dimension of $\phi^2$.

Now we repeat the same steps as before. Using Feynman graphs, we can
calculate $Z_4$ in terms of a, b, c, d. We find:

$$ Z_4 = 1 - \frac{n+2}{6}\,a\,(Su) + \frac{n+2}{6}\,(c - a^2)(Su)^2 + O(u^3) \tag{17.144} $$

From this, we can derive an expression for $\gamma_4$:

$$ \gamma_4(u) = -\epsilon\,\frac{n+2}{6}\,(Su)\left[ -a + (Su)(2c - a^2) \right] + O(u^3) \tag{17.145} $$

To sum up, our strategy, as we mentioned earlier, has been to calculate $Z_1$ and
$Z_3$, which gives us $\beta(u)$. By solving for $\beta(u^*) = 0$, we can calculate $u^*(\epsilon)$ at the
critical point. We insert $u^*(\epsilon)$ into the expression for $\gamma_{3,4}$, which gives us a power
expansion for $\eta$ and $\gamma$. Then, by the scaling relations, we can determine all the
critical exponents.

This expansion can be carried out to arbitrary accuracy in $\epsilon$. We list some of
the critical exponents which have been calculated out to fourth and fifth order$^{11}$:

$$ \alpha = -\frac{(n-4)}{2(n+8)}\,\epsilon - \frac{(n+2)^2(n+28)}{4(n+8)^3}\,\epsilon^2 - \frac{(n+2)}{8(n+8)^5}\left[ n^4 + 50n^3 + 920n^2 + 3472n + 4800 - 192(5n+22)(n+8)T \right]\epsilon^3 + O(\epsilon^4) $$

$$ \beta = \frac{1}{2} - \frac{3}{2(n+8)}\,\epsilon + \frac{(n+2)(2n+1)}{2(n+8)^3}\,\epsilon^2 + \frac{(n+2)}{8(n+8)^5}\left[ 3n^3 + 128n^2 + 488n + 848 - 48(5n+22)(n+8)T \right]\epsilon^3 + O(\epsilon^4) $$

$$ \gamma = 1 + \frac{(n+2)}{2(n+8)}\,\epsilon + \frac{(n+2)}{4(n+8)^3}\,(n^2 + 22n + 52)\,\epsilon^2 + \frac{(n+2)}{8(n+8)^5}\left[ n^4 + 44n^3 + 664n^2 + 2496n + 3104 - 96(5n+22)(n+8)T \right]\epsilon^3 + O(\epsilon^4) $$

$$ \delta = 3 + \epsilon + \frac{1}{2(n+8)^2}\,(n^2 + 14n + 60)\,\epsilon^2 + \frac{1}{4(n+8)^4}\,(n^4 + 30n^3 + 276n^2 + 1376n + 3168)\,\epsilon^3 + \frac{1}{16(n+8)^6}\left[ 2n^6 + 96n^5 + 1778n^4 + 12760n^3 + 50280n^2 + 147136n + 263040 + 768(n+2)(n+8)(5n+22)T \right]\epsilon^4 + O(\epsilon^5) $$

$$ \eta = \frac{(n+2)}{2(n+8)^2}\,\epsilon^2 + \frac{(n+2)}{8(n+8)^4}\,(-n^2 + 56n + 272)\,\epsilon^3 + \frac{(n+2)}{32(n+8)^6}\left[ -5n^4 - 230n^3 + 1124n^2 + 17920n + 46144 - 768(5n+22)(n+8)T \right]\epsilon^4 + O(\epsilon^5) $$

$$ \nu = \frac{1}{2} + \frac{(n+2)}{4(n+8)}\,\epsilon + \frac{(n+2)}{8(n+8)^3}\,(n^2 + 23n + 60)\,\epsilon^2 + \frac{(n+2)}{32(n+8)^5}\left[ 2n^4 + 89n^3 + 1412n^2 + 5904n + 8640 - 192(5n+22)(n+8)T \right]\epsilon^3 + O(\epsilon^4) \tag{17.146} $$

where T = 0.60103.

To check the reliability of the methodology, we compare the perturbation
calculation at $\epsilon = 1$ with the high-temperature series calculations for the
three-dimensional Ising model, using the $\epsilon$ expansion and the Landau theory:

Exponent    $\epsilon$ expansion    Ising             Landau
$\alpha$    0.077                   0.125 ± 0.015     0
$\beta$     0.340                   0.312 ± 0.003     0.5
$\gamma$    1.244                   ...               1
$\delta$    4.46                    5.150 ± 0.02      3
$\eta$      0.037                   ...               0
$\nu$       0.626                   ...               0.5
                                                      (17.147)

The $\epsilon$ expansion is taken to second order, except for $\eta$, which is taken to third
order. The agreement is surprisingly good even for $\epsilon = 1$.
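
The $\epsilon$-expansion values quoted for the Ising case can be reproduced by truncating the series of Eq. (17.146) at second order (third order for $\eta$) and setting n = 1, $\epsilon = 1$. The sketch below assumes the low-order coefficients read as in that equation; the function name is hypothetical and exact rational arithmetic is used so no rounding enters before printing.

```python
from fractions import Fraction


def epsilon_series(n, eps):
    """Series through second order in epsilon (third order for eta)."""
    n, eps = Fraction(n), Fraction(eps)
    p, q = n + 2, n + 8
    return {
        "alpha": -(n - 4) * eps / (2 * q)
        - p**2 * (n + 28) * eps**2 / (4 * q**3),
        "beta": Fraction(1, 2) - 3 * eps / (2 * q)
        + p * (2 * n + 1) * eps**2 / (2 * q**3),
        "gamma": 1 + p * eps / (2 * q)
        + p * (n**2 + 22 * n + 52) * eps**2 / (4 * q**3),
        "delta": 3 + eps + (n**2 + 14 * n + 60) * eps**2 / (2 * q**2),
        "eta": p * eps**2 / (2 * q**2)
        + p * (-(n**2) + 56 * n + 272) * eps**3 / (8 * q**4),
        "nu": Fraction(1, 2) + p * eps / (4 * q)
        + p * (n**2 + 23 * n + 60) * eps**2 / (8 * q**3),
    }


ising = epsilon_series(1, 1)
for name, value in ising.items():
    print(name, float(value))
```

Because the series is only asymptotic, adding higher orders at $\epsilon = 1$ does not necessarily improve these numbers; the truncation order used here is the one stated in the text.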

(We caution, however, that the $\epsilon$ expansion is not convergent but only
asymptotic. The convergence properties of the $\epsilon$ expansion are not fully understood.)

In summary, in this chapter we have seen how phase transitions can be
categorized according to their critical exponents. In two dimensions, a large class
of exactly solvable statistical models exists. They are solvable
because of commuting transfer matrices, or the Yang-Baxter relation. Unfortunately,
many of the techniques used to solve these two-dimensional models do not
carry over to four dimensions.

The mean-field approximation has been one of the main ways in which to
extract qualitative features of more complicated statistical systems. However,
trying to treat the mean-field approximation as a Born term to a power expansion
fails because, at criticality, the coupling constant $g\xi^{4-d}$ is large. Fortunately,
the renormalization group method allows one to expand in $u = gm^{d-4}$. We can
then extract meaningful relations by solving for $\beta(u^*) = 0$ near criticality and
inserting these relations back into the scaling relations. The results, even for the
three-dimensional case ($\epsilon = 1$), are surprisingly good.


17.7. Exercises 


1. Show that $\langle\hat{\sigma}^4\rangle = (n^2 + 2n)(n_c/n)^2(1 - s^{2-d})^2$ for the Ginzburg-Landau
model. (Hint: use Wick's theorem.)


2. Consider the one-dimensional partition function:

$$ Z = \int\prod_m\left[ ds_m\,\exp\left( -\tfrac{1}{2}bs_m^2 - us_m^4 \right) \right]\exp\Big( K\sum_n s_ns_{n+1} \Big) \tag{17.148} $$

where the spins $s_n$ are arranged discretely along a line and they can assume
any real value. Take the limit $u \to \infty$ and $b \to -\infty$ with b = -4u. Show
that this model becomes the familiar Ising model in this limit [if one puts in
a factor $(u/\pi)^{1/2}\exp(-u)$ per spin]. (Hint: show that we recover a Dirac delta
function condition on the spin $s_n$.)


3. In the limit $u \to 0$, this becomes the Gaussian model. Why is it exactly
solvable? Rewrite the Hamiltonian totally in terms of the Fourier transform
$\sigma_q = \sum_n\exp(-iq\cdot n)s_n$. Show that the term appearing in the partition
function, $\beta H = K\sum_n\sum_i s_ns_{n+i} - \tfrac{1}{2}b\sum_n s_n^2$, can be rewritten as:

$$ \beta H = -\frac{1}{2}\int\frac{d^dq}{(2\pi)^d}\Big[ K\sum_i|\exp(iq_i) - 1|^2 + (b - 2dK) \Big]|\sigma_q|^2 \approx -\frac{1}{2}\int\frac{d^dq}{(2\pi)^d}\,(q^2 + r)\,|\sigma_q|^2 \tag{17.149} $$

where K is rescaled to one, and r = (b - 2dK)/K and |q| < 1.


4. For the Gaussian model, the two-point function is $\Gamma_q = 1/(q^2 + r)$. In x
space, we have: $\Gamma(x) = \int e^{iq\cdot x}\,\Gamma_q\,d^dq/(2\pi)^d$. Show that this gives
$\Gamma(x) \sim \exp(-\sqrt{r}|x|)$ and hence $\xi \sim 1/\sqrt{r}$. Given the form of r with K ~ 1/kT,
show that $\nu = 1/2$.




5. Consider the Ising model in d dimensions using mean-field theory. The
partition function is given by summing over nearest neighbors i, j:

$$ Z = \sum_{\rm spins}\exp\Big( \beta\sum_{\langle ij\rangle} s_is_j \Big) \tag{17.150} $$

The magnetization is given by $M = \langle s_i\rangle$. In the mean-field approximation,
assume that all neighboring spins are replaced by their average value M.
Then the sum over nearest neighbors picks up the factor $2d\beta M$. Show that
the Boltzmann probability for the ith spin to have the value $s_i$ is given by:

$$ P(s_i) = \frac{\exp(2d\beta Ms_i)}{2\cosh(2d\beta M)} \tag{17.151} $$

(Hint: perform the sum over 2d neighboring sites, and treat the denominator
as a normalization factor.)


6. For this Ising system, assume that the average of $s_i$ is also M. Show that this
gives us the self-consistency equation for the mean-field approximation:

$$ M = \langle s_i\rangle = \tanh(2d\beta M) \tag{17.152} $$

For small $\beta$, the unique solution is M = 0. For larger $\beta$, at a certain point
there are nonzero solutions for M. Show that this phase transition takes place
at $\beta > \beta_c = 1/2d$. This crude assumption agrees remarkably well with
the correct result, especially for larger d. (Hint: plot the equation for M
graphically for various values of $\beta$, and show that a phase transition occurs at
$\beta_c$.)

7. Prove Eq. (17.38).

8. Fill in the steps in Eq. (17.101).

9. Prove Eqs. (17.105) and (17.107).

10. Draw the Feynman graphs that correspond to Eq. (17.131).

11. Prove Eq. (17.132).


12. Nonperturbative information can be extracted from SU(N) gauge theory in
the limit that $N \to \infty$. Write QCD in the fundamental representation, so
that the gauge field is written as $A_\mu{}^a{}_b$ for a, b = 1, 2, 3, since the adjoint
representation can be written as the product of 3 and 3*. Show that the QCD
action can be written as follows:

$$ L = \frac{N}{g^2}\left[ -\frac{1}{4}F_{\mu\nu}{}^a{}_b F^{\mu\nu\,b}{}_a + \bar{\psi}_a\gamma^\mu\left( i\partial_\mu\delta^a_b + A_\mu{}^a{}_b \right)\psi^b - m\bar{\psi}_a\psi^a \right] \tag{17.153} $$

where:

$$ F_{\mu\nu}{}^a{}_b = \partial_\mu A_\nu{}^a{}_b - \partial_\nu A_\mu{}^a{}_b + i\left( A_\mu{}^a{}_c A_\nu{}^c{}_b - A_\nu{}^a{}_c A_\mu{}^c{}_b \right) \tag{17.154} $$

and $A_\mu{}^a{}_b$ is a Hermitian, traceless matrix, such that $A_\mu{}^a{}_a = 0$ and
$(A_\mu{}^a{}_b)^\dagger = A_\mu{}^b{}_a$. Show that the gluon and quark propagator become:

$$ \langle 0|T\left[ A_\mu{}^a{}_b(x)A_\nu{}^c{}_d(y) \right]|0\rangle = \left( \delta^a_d\delta^c_b - \frac{1}{N}\delta^a_b\delta^c_d \right)D_{\mu\nu}(x - y), \qquad \langle 0|T\left[ \psi^a(x)\bar{\psi}_b(y) \right]|0\rangle = \delta^a_b S_F(x - y) \tag{17.155} $$

13. Consider the large-N limit with $g^2N$ held fixed. Consider a vacuum Feynman
diagram of very large order. It has the shape of a large polyhedron, with F
faces, V vertices, and I internal lines. Using Feynman rules, this polyhedron
corresponds to a Feynman diagram with I propagators, V vertices, and F
traces over internal lines. Show that whenever we trace over a loop (face), we
pick up a factor of N, since $\delta^a_a = N$. Show that each gluon vertex contributes
a factor of N, and that each internal line I contributes a factor $N^{-1}$; that is,
show that:

$$ \text{Faces} \to N, \qquad \text{Vertices} \to N, \qquad \text{Lines} \to N^{-1} \tag{17.156} $$

Show that the Feynman diagram for this polyhedron has the overall factor of:

$$ N^{F + V - I} = N^{\chi} \tag{17.157} $$

where $\chi$ is called the Euler characteristic of a polyhedron. It is a topological
invariant.

14. Now show that we can envision the vacuum Feynman graph as a sphere with
H handles (holes) and B boundaries, where the surface of the sphere is
triangulated by a large number of triangles making up the vertices and propagators
of a Feynman diagram. Show that the Euler characteristic becomes:

$$ \chi = 2 - 2H - B \tag{17.158} $$

Show that the leading vacuum graphs behave like $N^2$. They are topologically
equivalent to spheres with no handles or boundaries (H = B = 0). They
have no fermion lines (since fermions punch holes in the sphere, and thereby
decrease the Euler number). They correspond to purely planar diagrams.
At the next order N, show that we have planar surfaces bounded by closed
fermion lines; that is, we have a sphere with one boundary B = 1. At the next
order O(1), we have either a sphere with a hole (i.e., a doughnut), or a sphere
with two boundaries (i.e., a disk with an inner and outer boundary). Comment
on the physical meaning of the $N \to \infty$ limit, in terms of the bound states of
the theory and the gluon “strings” that form between quarks.


Chapter 18 
Grand Unified Theories 


We present a series of hypotheses and speculations leading inescapably
to the conclusion that SU(5) is the gauge group of the world...
—H. Georgi and S. Glashow 


18.1 Unification and Running Coupling Constants 


The Standard Model successfully incorporates all the known properties of the 
strong, weak, and electromagnetic forces. In fact, there is not a single experiment 
in particle physics that contradicts the results of the Standard Model. Its weakness, 
however, is that it is ad hoc: It has too many arbitrary parameters (especially 
quark masses) and absolutely no interaction with the gravitational force. Since the 
various interactions are simply spliced together, one feels that a more fundamental 
theory should be possible. 

One improvement on the Standard Model is the class of Grand Unified Theories (GUTs).
GUTs also share many of the weaknesses of the Standard Model (e.g., too many 
arbitrary parameters, no interaction with gravity). However, they are genuine 
unified field theories because there is only one gauge group and hence only one 
coupling constant. Furthermore, they make a prediction that is now the subject of 
several ongoing experiments: the decay of the proton. 

One of the most compelling arguments for the unification of these forces 
comes from asymptotic freedom. We deduced previously that the $\beta$ function for
Yang-Mills theory can be written as:

$$\beta(g) = -\frac{g^3}{16\pi^2}\left(\frac{11}{3}N - \frac{2}{3}N_f\right) \tag{18.1}$$

for an SU(N) gauge theory coupled to $N_f$ fermions transforming as the
N-dimensional representation of the group.

Figure 18.1. The strong, weak, and electromagnetic running coupling constants plotted
against the energy. At the GUT scale, all three coupling constants seem to merge into one.


At extremely high energies, the Callan—Symanzik relation shows that the three 
coupling constants of the strong, weak, and electromagnetic interactions begin to 
converge, leading us to suspect that all three interactions become part of the same 
interaction at a very high energy. 

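For the strong, weak, and electromagnetic interactions, respectively, we have
three distinct equations for $\beta(g)$¹:

$$\begin{aligned}
\beta_3(g_3) &= -\frac{g_3^3}{16\pi^2}\left(11 - \frac{4}{3}N_F\right) \\
\beta_2(g_2) &= -\frac{g_2^3}{16\pi^2}\left(\frac{22}{3} - \frac{4}{3}N_F\right) \\
\beta_1(g_1) &= +\frac{g_1^3}{16\pi^2}\left(\frac{4}{3}N_F\right)
\end{aligned} \tag{18.2}$$

where we have set the number of Higgs particles to zero. All three equations can
be summarized as: $\beta_i(g_i) = b_i g_i^3/16\pi^2$, which then determines the value of $b_i$.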

Let us assume that there is a mass scale, governed by $M_X$, where all three
coupling constants converge:

$$\alpha_1(M_X) = \alpha_2(M_X) = \alpha_3(M_X) \tag{18.3}$$

where $\alpha_i = g_i^2/4\pi$.
Then the solution to the renormalization group equation is given by (Fig. 18.1):

$$\frac{1}{\alpha_i(\mu)} = \frac{1}{\alpha_i(M_X)} + \frac{b_i}{2\pi}\ln\frac{M_X}{\mu} \tag{18.4}$$

Unfortunately, this shows that the unification of these three coupling constants
takes place at an incredible energy scale of $10^{15}$ GeV, which is far beyond the
ability of any accelerator to probe. Between 1 TeV and $10^{15}$ GeV, the simplest
GUT theory predicts that there will be a large energy “desert” stretching across
12 orders of magnitude in which no new interactions will be found.
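
As a quick check of Eq. (18.4), the one-loop running can be integrated directly. The sketch below assumes the convention $\beta_i = \mu\, dg_i/d\mu = b_i g_i^3/16\pi^2$ stated above and keeps nothing beyond one loop:

```latex
\begin{align*}
\mu \frac{d g_i}{d\mu} = \frac{b_i\, g_i^3}{16\pi^2}
\;&\Longrightarrow\;
\mu \frac{d}{d\mu}\left(\frac{1}{g_i^2}\right) = -\frac{b_i}{8\pi^2}
\;\Longrightarrow\;
\mu \frac{d}{d\mu}\left(\frac{1}{\alpha_i}\right) = -\frac{b_i}{2\pi}, \\
\frac{1}{\alpha_i(\mu)} &= \frac{1}{\alpha_i(M_X)} + \frac{b_i}{2\pi}\ln\frac{M_X}{\mu}.
\end{align*}
```

With this sign convention, a negative $b_i$ (asymptotic freedom) makes $1/\alpha_i$ smaller, and hence the coupling larger, as $\mu$ drops below $M_X$.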

Although the existence of this enormous desert is one of the main criticisms of 
the theory, one attractive feature of this approach is that we have an experimental 
handle by which to verify such models, and this is proton decay. Since GUTs 
generically put leptons and quarks in the same multiplet, then the vector mesons of 
the theory will in general mix up these leptons and quarks, thereby mediating the 
decay of the quarks into leptons and hence producing proton decay. Since proton 
decay can be measured in the laboratory, this gives us an experimental handle by
which to accept or eliminate this approach. 


18.2 SU(5)


One of the earliest GUT models was that of Pati and Salam.² Perhaps the most
conservative choice for a model beyond the Standard Model is the “minimal”
SU(5) theory.³ The Standard Model, with the gauge group $SU(3) \otimes SU(2) \otimes U(1)$,
has four diagonal generators, corresponding to $T_3$, $T_8$ of color and $\tau_3$ and $Y$ of weak
isospin. The minimal choice beyond the Standard Model is a rank 4 group. The
complete set of rank 4 groups involving just one coupling constant can be easily
written down. There are just nine of them (including products of identical Lie
groups):

$$\begin{array}{ccc}
[SU(2)]^4 & [O(5)]^2 & [SU(3)]^2 \\
{[G_2]}^2 & O(8) & O(9) \\
Sp(8) & F_4 & SU(5)
\end{array} \tag{18.5}$$


We also want groups with complex representations, because the complex con- 
jugate of a field transforms differently from the field itself in the Standard Model. 
Of these, only SU(5) contains the Standard Model's gauge group with the proper 
complex representations of quarks and leptons. There are phenomenological 
problems with all of these groups except for SU(5). 

In addition to being the minimal model compatible with complex representa- 
tions, there are several other rather remarkable properties of SU(5) that make it 
physically attractive: 


1. It is free of anomalies.

2. It gives precisely the correct quantum number assignments for the 15
left-handed and right-handed quarks and leptons found in the Standard Model for
one generation.

3. It gives, after radiative corrections, a reasonably good approximation to
$\sin^2\theta_W$.

4. It gives a scenario by which the model breaks down to the Standard Model
via the Higgs mechanism:

$$SU(5) \to SU(3) \otimes SU(2) \otimes U(1) \to SU(3) \otimes U(1) \tag{18.6}$$


Let us now study each of these features of the GUT theory. 


18.3 Anomaly Cancellation


Very few groups and their representations give a cancellation of the chiral anomaly, 
but SU(5) gives such a cancellation with precisely the correct number of quarks 
and leptons. 

To analyze the representations of SU(5), we first remind ourselves that the 
anomaly is proportional to: 


$$\mathrm{Tr}\left(\{T^a, T^b\}\, T^c\right) \tag{18.7}$$


All representations of SU(N) can be found by tensoring the fundamental
representation (hence the name) and then taking the various symmetric and
antisymmetric combinations of the indices found in the Young tableaux. If we take
the antisymmetric representations $[N, m]$ of SU(N) (corresponding to a vertical
stack of $m$ boxes), then it is easy to see that their dimensionality is the number of
ways we can take $N$ things $m$ at a time:

$$\dim[N, m] = \frac{N!}{m!\,(N-m)!} \tag{18.8}$$

Furthermore, we can plug this fully antisymmetric representation of the
generators of SU(N) into the anomaly condition, and we arrive at:

$$A[N, m] = \frac{(N-3)!\,(N-2m)}{(N-m-1)!\,(m-1)!} \tag{18.9}$$


Now let $N = 5$. The fundamental representation is 5 with $m = 1$. If we
multiply two of these together, then we can rearrange them in symmetric and
antisymmetric combinations:

$$5 \otimes 5 = 10 \oplus 15 \tag{18.10}$$

If we take $m = 1$ (the 5) or $m = 2$ (the antisymmetric 10), we find that they have
the same anomaly contribution, while $m = 3$ (the $\overline{10}$) contributes with the same
magnitude and the opposite sign. Therefore, the anomaly contribution of a
right-handed 5 precisely cancels that of a left-handed 10. For the fermion
representation, anomaly cancellation demands that we take:

$$\text{Fermions:} \quad 5_R + 10_L \tag{18.11}$$

The anomaly also cancels for the following combinations of $[N, m]$:

$$\begin{aligned}
SU(5) &: \ [5, 1] \oplus [5, 3] \\
SU(6) &: \ 2[6, 1] \oplus [6, 4] \\
SU(7) &: \ [7, 2] \oplus [7, 4] \oplus [7, 6] \\
SU(8) &: \ [8, 1] \oplus [8, 2] \oplus [8, 5] \\
SU(9) &: \ [9, 2] \oplus [9, 5] \\
\text{or} &: \ [9, 1] \oplus [9, 3] \oplus [9, 5] \oplus [9, 7] \\
SU(10) &: \ [10, 3] \oplus [10, 6]
\end{aligned} \tag{18.12}$$


which, of course, does not exhaust all possible anomaly-free combinations. 
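
The cancellations listed in Eq. (18.12) can be verified mechanically from the anomaly formula (18.9). The following is a minimal sketch, assuming the formula $A[N,m] = (N-3)!\,(N-2m)/[(N-m-1)!\,(m-1)!]$ and using function and variable names (`anomaly`, `total_anomaly`, `ANOMALY_FREE_SETS`) that are introduced here purely for illustration:

```python
from fractions import Fraction
from math import comb, factorial


def dimension(n, m):
    """Dimension of the antisymmetric representation [N, m], Eq. (18.8)."""
    return comb(n, m)


def anomaly(n, m):
    """Anomaly coefficient of [N, m] from Eq. (18.9), for N >= 3 and 1 <= m <= N-1."""
    return Fraction(
        factorial(n - 3) * (n - 2 * m),
        factorial(n - m - 1) * factorial(m - 1),
    )


def total_anomaly(n, content):
    """Sum of anomalies for a list of (multiplicity, m) pairs in SU(n)."""
    return sum(mult * anomaly(n, m) for mult, m in content)


# Combinations quoted in Eq. (18.12), written as (N, [(multiplicity, m), ...]).
ANOMALY_FREE_SETS = [
    (5, [(1, 1), (1, 3)]),
    (6, [(2, 1), (1, 4)]),
    (7, [(1, 2), (1, 4), (1, 6)]),
    (8, [(1, 1), (1, 2), (1, 5)]),
    (9, [(1, 2), (1, 5)]),
    (9, [(1, 1), (1, 3), (1, 5), (1, 7)]),
    (10, [(1, 3), (1, 6)]),
]

if __name__ == "__main__":
    for n, content in ANOMALY_FREE_SETS:
        print(f"SU({n}) {content}: total anomaly = {total_anomaly(n, content)}")
```

In this convention $[5,3]$ is the conjugate of $[5,2]$, so the SU(5) entry is the conjugate of the $\bar{5} \oplus 10$ combination used for the fermions.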


18.4 Fermion Representation 


Anomaly cancellation by itself is not so remarkable, since many other
representations can achieve this. What is remarkable about this construction, however,
is that the $\bar{5} \oplus 10$ representation contains precisely the correct quantum numbers
necessary to retrieve the Standard Model.

In the Standard Model, if we count the number of chiral fermion modes, we
find that we have 12 modes from the $u^i$ and $d^i$ quark sectors, 2 modes from the
electron field, and 1 from the massless neutrino, for a total of 15 modes.

However, since SU(5) has no 15-dimensional representation, we must split up
the fermions into two parts, the sum of a 5- and 10-dimensional representation. To
accomplish this, we take the 5 to be right handed and the 10 to be left handed, so
that $\bar{5}$ and 10 are both left handed. This is precisely the anomaly-free
combination that we just computed.

This 5-dimensional spinor, transforming as [5, 1], is given by:

$$\psi_5 = \begin{pmatrix} d^1 \\ d^2 \\ d^3 \\ e^+ \\ -\nu_e^c \end{pmatrix}_R \tag{18.13}$$

If we break this down into the representations of $SU(3) \otimes SU(2)$, we have:

$$5 = (3, 1) \oplus (1, 2) \tag{18.14}$$

where the quarks correspond to (3, 1) and the leptons to (1, 2).


We will now take the left-handed 10 to consist of an antisymmetric [5, 2] with 
two antisymmetric indices: 


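$$\psi_{10} = \frac{1}{\sqrt{2}}
\begin{pmatrix}
0 & u^c_3 & -u^c_2 & -u^1 & -d^1 \\
-u^c_3 & 0 & u^c_1 & -u^2 & -d^2 \\
u^c_2 & -u^c_1 & 0 & -u^3 & -d^3 \\
u^1 & u^2 & u^3 & 0 & -e^+ \\
d^1 & d^2 & d^3 & e^+ & 0
\end{pmatrix}_L$$

$$10 = (3, 2) \oplus (\bar{3}, 1) \oplus (1, 1) \tag{18.15}$$

where the quarks correspond to $(3, 2) \oplus (\bar{3}, 1)$ and the electron to (1, 1).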

Likewise, the gauge mesons transform according to the adjoint representation
of SU(5), which has 24 elements. The breakdown of these elements in terms of
$SU(3) \otimes SU(2)$ is given by:

$$24 = (8, 1) \oplus (1, 3) \oplus (1, 1) \oplus (3, 2) \oplus (\bar{3}, 2) \tag{18.16}$$

From this decomposition, we can identify the gauge mesons corresponding to the
Standard Model. The (8, 1) corresponds to the usual colored gauge bosons of
$SU(3)_c$. The $(1, 3) \oplus (1, 1)$ mesons correspond to the $W_\mu$, $Z_\mu$, and the
electromagnetic field. Finally, the $(3, 2) \oplus (\bar{3}, 2)$ are new gauge mesons, which we call
the X and Y vector mesons, which couple the quarks to the leptons and hence
mediate proton decay.

Next, to extract any meaningful phenomenology from this model, we need a
specific representation of the $\lambda^a$ matrices of SU(5), from which we can identify
the charge and different isospin operators.

There exists, of course, an infinite number of ways in which we can choose
these matrices. For convenience, we will take the following representation. Let us
break up a $5 \times 5$ square matrix into four blocks, consisting of two smaller squares
and two rectangles. The upper left-hand corner block will be a $3 \times 3$ submatrix.
The lower right-hand corner block will be a $2 \times 2$ submatrix. In the off-diagonal
upper right- and lower left-hand corners, we will place $3 \times 2$ and $2 \times 3$ rectangular
submatrices.

We will adopt the normalization convention $\mathrm{Tr}\, L^a L^b = 2\delta^{ab}$. Then we can
write:

$$L^a = \begin{pmatrix} \lambda^a & 0 \\ 0 & 0 \end{pmatrix}, \qquad a = 1, 2, \ldots, 8 \tag{18.17}$$

where $\lambda^a$ are the usual Gell-Mann matrices for SU(3), and the 0 represents square
and rectangular blocks that contain only zeros for entries.

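For the 9th and 10th generators, we use two Pauli matrices $\sigma^{1,2}$ in the $2 \times 2$
block:

$$L^{9,10} = \begin{pmatrix} 0 & 0 \\ 0 & \sigma^{1,2} \end{pmatrix} \tag{18.18}$$

The 11th and 12th generators are taken to be diagonal, with the diagonal entries
given by:

$$\begin{aligned}
L^{11} &= \mathrm{diag}\,(0, 0, 0, 1, -1) \\
L^{12} &= \frac{1}{\sqrt{15}}\,\mathrm{diag}\,(-2, -2, -2, 3, 3)
\end{aligned} \tag{18.19}$$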


To define the next set of matrices, it will be convenient to define rectangular 
matrices A and B: 


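$$A_1 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 0 \end{pmatrix}; \quad
A_2 = \begin{pmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 0 \end{pmatrix}; \quad
A_3 = \begin{pmatrix} 0 & 0 \\ 0 & 0 \\ 1 & 0 \end{pmatrix} \tag{18.20}$$

and:

$$B_1 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \\ 0 & 0 \end{pmatrix}; \quad
B_2 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix}; \quad
B_3 = \begin{pmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 1 \end{pmatrix} \tag{18.21}$$

Then the 13th through the 24th generators are given by (for $k = 1, 2, 3$):

$$\begin{aligned}
L^{13,15,17} = L^{11+2k} &= \begin{pmatrix} 0 & A_k \\ A_k^T & 0 \end{pmatrix} \\
L^{14,16,18} = L^{12+2k} &= \begin{pmatrix} 0 & iA_k \\ -iA_k^T & 0 \end{pmatrix} \\
L^{19,21,23} = L^{17+2k} &= \begin{pmatrix} 0 & B_k \\ B_k^T & 0 \end{pmatrix} \\
L^{20,22,24} = L^{18+2k} &= \begin{pmatrix} 0 & iB_k \\ -iB_k^T & 0 \end{pmatrix}
\end{aligned} \tag{18.22}$$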
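
As a consistency check on this construction, one can build all 24 matrices and confirm that they are Hermitian, traceless, and normalized to $\mathrm{Tr}\, L^a L^b = 2\delta^{ab}$. The sketch below is illustrative only: it assumes the standard Gell-Mann and Pauli matrices, $L^{12} = \mathrm{diag}(-2,-2,-2,3,3)/\sqrt{15}$, and off-diagonal generators with $+i$ in the upper right block and $-i$ in the lower left block; all helper names are hypothetical.

```python
import math

I = 1j
SQ3 = math.sqrt(3.0)

# Standard Gell-Mann matrices lambda^1..lambda^8 (assumed convention).
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -I, 0], [I, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -I], [0, 0, 0], [I, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -I], [0, I, 0]],
    [[1 / SQ3, 0, 0], [0, 1 / SQ3, 0], [0, 0, -2 / SQ3]],
]
PAULI_12 = [[[0, 1], [1, 0]], [[0, -I], [I, 0]]]


def embed(block, row0, col0):
    """Place a small block inside an otherwise zero 5x5 matrix."""
    out = [[0j] * 5 for _ in range(5)]
    for i, row in enumerate(block):
        for j, x in enumerate(row):
            out[row0 + i][col0 + j] = complex(x)
    return out


def diagonal(entries, scale=1.0):
    """Diagonal 5x5 matrix with the given entries times an overall scale."""
    return [[complex(scale * e) if i == j else 0j for j in range(5)]
            for i, e in enumerate(entries)]


def off_diagonal(k, column, phase):
    """phase * (A_k or B_k) in the upper right block, plus its Hermitian conjugate."""
    out = [[0j] * 5 for _ in range(5)]
    out[k][3 + column] = phase
    out[3 + column][k] = phase.conjugate()
    return out


def su5_generators():
    gens = [embed(lam, 0, 0) for lam in GELL_MANN]             # L^1 .. L^8
    gens += [embed(sig, 3, 3) for sig in PAULI_12]             # L^9, L^10
    gens.append(diagonal([0, 0, 0, 1, -1]))                    # L^11
    gens.append(diagonal([-2, -2, -2, 3, 3], 1 / math.sqrt(15)))  # L^12
    for column in (0, 1):                                      # A_k, then B_k
        for k in range(3):
            gens.append(off_diagonal(k, column, 1 + 0j))
            gens.append(off_diagonal(k, column, 1j))
    return gens


def trace_of_product(a, b):
    return sum(a[i][k] * b[k][i] for i in range(5) for k in range(5))


def is_orthonormal_basis(gens, tol=1e-12):
    """True if every matrix is Hermitian and traceless and Tr(L^a L^b) = 2 delta^{ab}."""
    for a, ga in enumerate(gens):
        if abs(sum(ga[i][i] for i in range(5))) > tol:
            return False
        if any(abs(ga[i][j] - ga[j][i].conjugate()) > tol
               for i in range(5) for j in range(5)):
            return False
        for b, gb in enumerate(gens):
            expected = 2.0 if a == b else 0.0
            if abs(trace_of_product(ga, gb) - expected) > tol:
                return False
    return True


if __name__ == "__main__":
    generators = su5_generators()
    print(len(generators), is_orthonormal_basis(generators))
```

The check is insensitive to the overall sign of the imaginary generators, so it tests the block structure and normalization rather than that particular sign convention.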
k 


Now let us identify the charge operator from this representation. If we analyze 
the 5 representation of the fermions, their charge assignment is given by: 


$$Q = \frac{1}{2}\left(L^{11} + \sqrt{5/3}\, L^{12}\right) \tag{18.23}$$

Explicitly, the charge matrix is:

$$Q = \mathrm{diag}\,(-1/3, -1/3, -1/3, 1, 0) \tag{18.24}$$


The quarks have fractional 1/3 charge compared to the electron. This is 
extremely important, because the charge assignments of the various quarks and 
leptons are now quantized; that is, GUT theory gives us charge quantization. This 
is different from the usual U(1) Maxwell theory, where the charge e is a continuous 
parameter. Because the charge operator is now one of the generators of the group, 
its eigenvalues are quantized and we have a definite quantized charge assignment 
for the quarks and leptons. In fact, the quarks have 1/3 charge relative to the 
leptons just because there are three colors within SU(3).

Similarly, the charge assignment of the 10 representation can be computed. 
The charge operator $Q$, acting on the antisymmetric tensor $10 = \psi^{ij}$, yields:

$$Q(\psi^{ij}) = Q_i + Q_j \tag{18.25}$$


From this, we can read off the charge assignments of the 10, which are also 
experimentally correct. The relative ease with which we can generate the correct 
quantized quark and lepton assignments is one of the successes of the GUT 
approach. 
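
The charge assignments above are easy to reproduce numerically. The sketch below assumes $L^{11} = \mathrm{diag}(0,0,0,1,-1)$, $L^{12} = \mathrm{diag}(-2,-2,-2,3,3)/\sqrt{15}$, the charge operator of Eq. (18.23), and that the charge of the component $\psi^{ij}$ of the 10 is the sum $Q_i + Q_j$; the function names are illustrative only.

```python
import math
from collections import Counter
from fractions import Fraction

L11 = [0, 0, 0, 1, -1]
L12 = [x / math.sqrt(15) for x in (-2, -2, -2, 3, 3)]


def charge_diagonal():
    """Diagonal of Q = (1/2) (L^11 + sqrt(5/3) L^12), Eq. (18.23)."""
    return [0.5 * (a + math.sqrt(5.0 / 3.0) * b) for a, b in zip(L11, L12)]


def charges_of_ten(q):
    """Charges Q_i + Q_j of the ten independent components psi^{ij}, i < j."""
    return [q[i] + q[j] for i in range(5) for j in range(i + 1, 5)]


def as_thirds(x):
    """Round a floating-point charge to the nearest multiple of 1/3."""
    return Fraction(round(3 * x), 3)


if __name__ == "__main__":
    q = charge_diagonal()
    print([str(as_thirds(x)) for x in q])
    print(Counter(str(as_thirds(c)) for c in charges_of_ten(q)))
```

The ten charges come out as three of $-2/3$ (the $u^c$), three of $+2/3$ (the $u$), three of $-1/3$ (the $d$), and one of $+1$ (the $e^+$).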

We can now make the precise association between the vector mesons of SU(5)
and the vector mesons of the electroweak model ($W^i_\mu$, $B_\mu$) and the gluons
$G^a_\mu$ of QCD. We find:

$$\begin{aligned}
A^{1,\ldots,8}_\mu &\to G^{1,\ldots,8}_\mu \\
A^{9,10,11}_\mu &\to W^{1,2,3}_\mu \\
A^{12}_\mu &\to B_\mu
\end{aligned} \tag{18.26}$$

The new vector mesons, which are only specific to SU(5) and not the Standard
Model, are given the names:

$$\begin{aligned}
A^{13,14,\ldots,18}_\mu &\to X^i_\mu \\
A^{19,20,\ldots,24}_\mu &\to Y^i_\mu
\end{aligned} \tag{18.27}$$


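Putting everything together, the $5 \times 5$ matrix $A_\mu$ is given by:

$$A_\mu = \frac{1}{\sqrt{2}} \sum_{a=1}^{24} A^a_\mu L^a \tag{18.28}$$

where this can be rewritten as (suppressing the Lorentz index on the entries):

$$A_\mu = \begin{pmatrix}
G^1_1 - \frac{2B}{\sqrt{30}} & G^1_2 & G^1_3 & \bar{X}^1 & \bar{Y}^1 \\
G^2_1 & G^2_2 - \frac{2B}{\sqrt{30}} & G^2_3 & \bar{X}^2 & \bar{Y}^2 \\
G^3_1 & G^3_2 & G^3_3 - \frac{2B}{\sqrt{30}} & \bar{X}^3 & \bar{Y}^3 \\
X_1 & X_2 & X_3 & \frac{W^3}{\sqrt{2}} + \frac{3B}{\sqrt{30}} & W^+ \\
Y_1 & Y_2 & Y_3 & W^- & -\frac{W^3}{\sqrt{2}} + \frac{3B}{\sqrt{30}}
\end{pmatrix} \tag{18.29}$$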

Next, we would like to calculate $\sin^2\theta_W$. This is now easily accomplished
by extracting out the coupling constant for SU(2) and U(1) from the coupling of
gauge bosons.

The covariant derivative associated with $W^3_\mu$ and $B_\mu$ can be related to the
coupling of $A_\mu$ and $Z_\mu$ by extracting out the covariant derivative of the 11th and
12th generators:

$$\begin{aligned}
D_\mu &= \partial_\mu - i(g/2)\left(W^3_\mu L^{11} + B_\mu L^{12}\right) \\
&= \partial_\mu - i(g/2)\left[A_\mu\left(\sin\theta_W L^{11} + \cos\theta_W L^{12}\right)
 + Z_\mu\left(\cos\theta_W L^{11} - \sin\theta_W L^{12}\right)\right] \\
&\equiv \partial_\mu - i\left(eQA_\mu + gQ_Z Z_\mu\right)
\end{aligned} \tag{18.30}$$

Since $Q = (1/2)[L^{11} + \sqrt{5/3}\,L^{12}]$, we can take the ratio of 1 to $\sqrt{5/3}$, which equals
the ratio of $\sin\theta_W$ and $\cos\theta_W$. Thus, we have:

$$\sin^2\theta_W = \frac{3}{8} \tag{18.31}$$
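
The step from Eq. (18.30) to Eq. (18.31) can be spelled out by matching the coefficient of the photon field. The sketch below assumes the charge operator of Eq. (18.23) and the rotation $W^3_\mu = \sin\theta_W A_\mu + \cos\theta_W Z_\mu$, $B_\mu = \cos\theta_W A_\mu - \sin\theta_W Z_\mu$ implied by Eq. (18.30):

```latex
\begin{align*}
\frac{g}{2}\left(\sin\theta_W\, L^{11} + \cos\theta_W\, L^{12}\right)
 &= eQ = \frac{e}{2}\left(L^{11} + \sqrt{5/3}\; L^{12}\right), \\
g\sin\theta_W = e, \quad g\cos\theta_W = \sqrt{5/3}\; e
 \quad&\Longrightarrow\quad \tan^2\theta_W = \frac{3}{5}, \\
\sin^2\theta_W = \frac{\tan^2\theta_W}{1+\tan^2\theta_W}
 &= \frac{3/5}{8/5} = \frac{3}{8}.
\end{align*}
```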


This prediction, by itself, would be a disappointment, since experimentally
the measured value is $\sin^2\theta_W \approx 0.23$. However, we must consider this to be
a first-order value of the Weinberg angle $\theta_W$ that must be corrected by radiative
corrections coming from the renormalization group.

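Earlier, we analyzed how the renormalization group arguments give us a handle
on the size of the GUT scale. Let us re-examine this calculation with SU(5) in
mind. For coupling constants $g_1$, $g_2$, $g_3$ for the groups within the Standard Model,
we have the following solutions to the renormalization group equations:

$$\begin{aligned}
\frac{g_1^2(\mu)}{4\pi} &= \frac{5}{3}\,\frac{\alpha(\mu)}{\cos^2\theta_W} \\
\frac{g_2^2(\mu)}{4\pi} &= \frac{\alpha(\mu)}{\sin^2\theta_W} \\
\frac{g_3^2(\mu)}{4\pi} &= \alpha_s(\mu)
\end{aligned} \tag{18.32}$$

where $\alpha(\mu)$ describes the electromagnetic coupling.
By taking linear combinations of these equations, and using the fact that
$g_i(M_X)$ are all equal, we find:

$$\ln\frac{M_X}{\mu} = \frac{\pi}{11}\left[\frac{1}{\alpha(\mu)} - \frac{8}{3}\,\frac{1}{\alpha_s(\mu)}\right] \tag{18.33}$$

and:

$$\sin^2\theta_W \approx \frac{3}{8} - \frac{55}{24\pi}\,\alpha(\mu)\,\log\frac{M_X}{\mu} \tag{18.34}$$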
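
The two relations (18.33) and (18.34) can be checked numerically against the one-loop running of Eq. (18.4). The sketch below is not the author's calculation: it assumes one-loop coefficients $b_3 = -(11 - 4N_F/3)$, $b_2 = -(22/3 - 4N_F/3)$, $b_1 = +4N_F/3$ for $N_F$ fermion generations with the Higgs neglected, the SU(5) normalization $1/\alpha = 1/\alpha_2 + (5/3)/\alpha_1$ and $\sin^2\theta_W = \alpha/\alpha_2$, and purely illustrative input values for $M_X$, $\mu$, and the unified coupling.

```python
import math


def one_loop_b(n_generations=3):
    """Assumed coefficients b_i in beta_i = b_i g_i^3 / (16 pi^2), Higgs neglected."""
    b1 = 4.0 * n_generations / 3.0
    b2 = -(22.0 / 3.0 - 4.0 * n_generations / 3.0)
    b3 = -(11.0 - 4.0 * n_generations / 3.0)
    return b1, b2, b3


def inverse_couplings(mu, m_x, alpha_gut, n_generations=3):
    """1/alpha_i(mu) from Eq. (18.4) with a common value alpha_gut at mu = M_X."""
    log_ratio = math.log(m_x / mu)
    return tuple(1.0 / alpha_gut + b / (2.0 * math.pi) * log_ratio
                 for b in one_loop_b(n_generations))


def low_energy_parameters(mu, m_x, alpha_gut, n_generations=3):
    """Return (alpha_em, alpha_s, sin^2 theta_W) at the scale mu."""
    inv1, inv2, inv3 = inverse_couplings(mu, m_x, alpha_gut, n_generations)
    inv_alpha_em = inv2 + (5.0 / 3.0) * inv1   # 1/e^2 = 1/g^2 + 1/g'^2, g'^2 = (3/5) g_1^2
    alpha_em = 1.0 / inv_alpha_em
    return alpha_em, 1.0 / inv3, alpha_em * inv2


if __name__ == "__main__":
    # Illustrative inputs only (GeV and a dimensionless coupling).
    mu, m_x, alpha_gut = 91.0, 4.0e14, 1.0 / 42.0
    alpha_em, alpha_s, sin2 = low_energy_parameters(mu, m_x, alpha_gut)
    log_ratio = math.log(m_x / mu)
    rhs_33 = math.pi / 11.0 * (1.0 / alpha_em - 8.0 / (3.0 * alpha_s))
    rhs_34 = 3.0 / 8.0 - 55.0 / (24.0 * math.pi) * alpha_em * log_ratio
    print(log_ratio, rhs_33)
    print(sin2, rhs_34)
```

Under these assumptions both relations hold identically at one loop, and the dependence on the number of generations drops out of each of them.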
A careful analysis of the parameters of the theory gives us:

$$\begin{aligned}
M_X &\approx 4 \times 10^{14}\ \text{GeV} \\
\sin^2\theta_W &\approx 0.206 \pm 0.016
\end{aligned} \tag{18.35}$$

while the experimentally observed value is $\sin^2\theta_W \approx 0.2325 \pm 0.0008$.


18.5 Spontaneous Breaking of SU(5)


Next, we wish to use the Higgs mechanism to break down the theory to the Standard
Model and eventually down to $SU(3)_c \otimes U(1)_{em}$. The first breaking down to the
Standard Model is accomplished by taking a Higgs boson transforming in the 24
adjoint representation of the group. The second breaking down to $SU(3) \otimes U(1)$
is achieved via a 5 of Higgs:

$$\begin{aligned}
24 &: \ SU(5) \to SU(3) \otimes SU(2) \otimes U(1) \\
5 &: \ SU(3) \otimes SU(2) \otimes U(1) \to SU(3) \otimes U(1)
\end{aligned} \tag{18.36}$$

If we describe the breaking of the Lie algebra via the adjoint representation
labeled by $(L^a)_{bc} = f^{abc}$, then the matrices would be $24 \times 24$, which is
unwieldy. To make things simpler, we note that:

$$5 \otimes \bar{5} = 24 \oplus 1 \tag{18.37}$$

which allows us to write down the Higgs as $\Phi^b_a$, where $a$ represents the 5 index
and $b$ the $\bar{5}$ index. We then take the 24 Higgs to be represented by the product:

$$\Phi^b_a = \phi_a \bar{\phi}^b - \frac{1}{5}\,\delta^b_a\, \phi_c \bar{\phi}^c \tag{18.38}$$

where we have subtracted out the 1.

We can, of course, reassemble these mesons in terms of the 24 members of the
adjoint representation of SU(5). In the adjoint representation, the Higgs meson
becomes: $\Phi = \Phi^a L^a/\sqrt{2}$ for $a = 1, 2, \ldots, 24$. The kinetic term for the Higgs
field is given by:

$$\mathcal{L} = \frac{1}{2}\,\mathrm{Tr}\, D_\mu \Phi^\dagger D^\mu \Phi \tag{18.39}$$

where:

$$D_\mu \Phi = \partial_\mu \Phi - ig\left[A_\mu, \Phi\right] \tag{18.40}$$

We now choose a potential $V(\Phi)$ such that the minimum is given by:

$$\langle \Phi \rangle = v\,\mathrm{diag}\,(1, 1, 1, -3/2, -3/2) \tag{18.41}$$

This is equal to the unit matrix in the SU(3) subspace and equal to the unit
matrix times $-3/2$ in the SU(2) subspace. The only generators that have nonzero
commutators with this vacuum expectation value are those associated with the X
and Y vector bosons. Thus, the X and Y vector bosons get mass, while preserving
the massless nature of the vector bosons for $SU(3) \otimes SU(2) \otimes U(1)$. This
successfully breaks SU(5) down to $SU(3) \otimes SU(2) \otimes U(1)$.
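
The counting of massive and massless vector bosons follows from which generators commute with the diagonal vacuum expectation value. A minimal sketch, assuming $\langle\Phi\rangle = v\,\mathrm{diag}(1,1,1,-3/2,-3/2)$ and using the fact that the elementary matrix $E_{ij}$ satisfies $[E_{ij}, \langle\Phi\rangle] = (\phi_j - \phi_i)E_{ij}$; the function name is hypothetical:

```python
from fractions import Fraction
from itertools import permutations

# Diagonal entries of <Phi> in units of v (assumed form of the minimum).
SU5_VEV = [Fraction(1), Fraction(1), Fraction(1), Fraction(-3, 2), Fraction(-3, 2)]


def count_unbroken(vev_diagonal):
    """Return (unbroken, broken) generator counts of SU(n) for a diagonal adjoint VEV.

    Each ordered pair (i, j) with i != j labels one off-diagonal direction E_ij,
    which fails to commute with the VEV exactly when the two entries differ.
    The n - 1 diagonal generators always commute with a diagonal VEV.
    """
    n = len(vev_diagonal)
    broken = sum(1 for i, j in permutations(range(n), 2)
                 if vev_diagonal[i] != vev_diagonal[j])
    return (n * n - 1) - broken, broken


if __name__ == "__main__":
    print(count_unbroken(SU5_VEV))
```

The 12 unbroken generators match the $8 + 3 + 1$ of $SU(3) \otimes SU(2) \otimes U(1)$, and the 12 broken ones are the X and Y bosons. For comparison, a vacuum expectation value proportional to $\mathrm{diag}(1,1,1,1,-4)$ would leave 16 generators unbroken, corresponding to $SU(4) \otimes U(1)$.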

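The mass matrix for the vector bosons is given by:

$$\frac{1}{2}\, g^2\, \mathrm{Tr}\left(\left[A_\mu, \langle\Phi\rangle\right]^2\right) = m^2_{ab}\, A^a_\mu A^{b\mu} \tag{18.42}$$

Then we get:

$$M_X^2 = M_Y^2 = \frac{25}{8}\, g^2 v^2 \tag{18.43}$$

To arrive at this assignment of quantum numbers, we will take the Higgs potential
to be:

$$V(\Phi) = -\frac{\mu^2}{2}\,\mathrm{Tr}\,(\Phi^2) + \frac{a}{4}\left(\mathrm{Tr}\,\Phi^2\right)^2 + \frac{b}{2}\,\mathrm{Tr}\,(\Phi^4) \tag{18.44}$$

We can now shift the minimum and get an expression for $v$ in terms of the
Higgs potential parameters:

$$\mu^2 = \frac{15}{2}\, a v^2 + \frac{7}{2}\, b v^2 \tag{18.45}$$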
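
The relation (18.45) can be checked by evaluating the potential along the direction of the vacuum expectation value. This is only a sketch: it assumes the normalization $V(\Phi) = -\frac{\mu^2}{2}\mathrm{Tr}\,\Phi^2 + \frac{a}{4}(\mathrm{Tr}\,\Phi^2)^2 + \frac{b}{2}\mathrm{Tr}\,\Phi^4$ and $\langle\Phi\rangle = v\,\mathrm{diag}(1,1,1,-3/2,-3/2)$, and it does not examine other directions in field space.

```latex
\begin{align*}
\mathrm{Tr}\,\langle\Phi\rangle^2 &= v^2\left(3 + 2\cdot\tfrac{9}{4}\right) = \tfrac{15}{2}\,v^2, \qquad
\mathrm{Tr}\,\langle\Phi\rangle^4 = v^4\left(3 + 2\cdot\tfrac{81}{16}\right) = \tfrac{105}{8}\,v^4, \\
V(v) &= -\tfrac{15}{4}\,\mu^2 v^2 + \tfrac{225}{16}\,a\,v^4 + \tfrac{105}{16}\,b\,v^4, \\
\frac{dV}{dv} = 0 \;&\Longrightarrow\; \mu^2 = \tfrac{15}{2}\,a\,v^2 + \tfrac{7}{2}\,b\,v^2 .
\end{align*}
```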


The second stage of Higgs breaking is mediated by a 5 Higgs boson,
transforming as a doublet under SU(2). From our discussion of the electroweak model,
we know that a suitable choice for the Higgs meson is given by:

$$H = \begin{pmatrix} h^1 \\ h^2 \\ h^3 \\ h^+ \\ -h^0 \end{pmatrix} = (3, 1) \oplus (1, 2) \tag{18.46}$$

The potential for $H$ is the same as for the electroweak model:

$$V(H) = -\frac{\nu^2}{2}\,|H|^2 + \frac{\lambda}{4}\left(|H|^2\right)^2 \tag{18.47}$$

As usual, the breaking of SU(2) is performed by having an expectation value
along the $h^0$ direction:

$$\langle H \rangle = (0, 0, 0, 0, v_0)^T, \quad \text{that is,} \quad \langle -h^0 \rangle = v_0 \tag{18.48}$$

where $\nu^2 = \lambda v_0^2$. Then, as expected, we will have:

$$M_W^2 = M_Z^2 \cos^2\theta_W = \frac{1}{2}\, g^2 v_0^2 \tag{18.49}$$

Finally, we look at the fermion masses of the theory. They are generated via
the generic coupling $\bar{\psi}\psi\Phi$, where $\Phi$ acquires a vacuum expectation value.

Since fermion masses appear via the coupling of two fermions to the Higgs
boson, we expect them to arise from interactions involving the combination
$(\bar{5} \oplus 10) \otimes (\bar{5} \oplus 10)$. This decomposition is given by:

$$\begin{aligned}
\bar{5} \otimes 10 &= 5 \oplus 45 \\
10 \otimes 10 &= \bar{5} \oplus \overline{45} \oplus \overline{50} \\
\bar{5} \otimes \bar{5} &= \overline{10} \oplus \overline{15}
\end{aligned} \tag{18.50}$$

Fermion masses are thus generated by the Higgs in the following
representations: 5, 10, 15, 45, 50. Since the 5 Higgs was used in the minimal SU(5) model
to break the electroweak interactions, no new Higgs need be added to the minimal
theory. Nonminimal SU(5), involving more parameters, can be constructed using
the 10, 15, 45, 50, which appear in the tensor product decomposition, meaning
that these Higgs can couple to two fermions and generate masses.

The nice feature of this construction, however, is that 24 does not appear in the
tensor product decomposition. Thus, the $\Phi_{24}$ Higgs meson does not couple to two
fermions, and hence the fermion masses cannot be of the GUT scale $M_X$. This is
gratifying, since we do not want any of the quarks and leptons to have GUT scale
masses. Notice that the 5 does appear in the tensor product decomposition, which
means that the fermion masses can be of the order of $M_W$.

Although minimal SU(5) seems to unify the strong and electroweak
interactions in a surprisingly tight fashion, we should point out that the current
experimental limits on proton decay seem to rule it out. For example, the theoretical
lifetime for the decay of the proton into a positron and a neutral pion is:

$$\Gamma^{-1}(p \to e^+ \pi^0) = 4.5 \times 10^{29 \pm 1.7}\ \text{yr} \tag{18.51}$$

which is much too fast. Experiment has pushed the proton lifetime to:

$$\Gamma^{-1}(p \to e^+ + \pi^0),\ \ \Gamma^{-1}(n \to e^+ + \pi^-) > 6 \times 10^{32}\ \text{yr} \tag{18.52}$$


Furthermore, electroweak measurements of the Z mass are apparently precise
enough to cast doubt on the value of $\sin^2\theta_W \approx 0.206$, predicted by the minimal
model [see Eq. (18.35)]. This, of course, does not rule out more complicated,
nonminimal SU(5) and models with more complicated GUT groups and couplings.


18.6 Hierarchy Problem


The most important theoretical challenge facing GUT theory is the hierarchy
problem.⁴ The origin of this problem is easy to isolate. The SU(5) theory, for
example, has two Higgs bosons, which introduce the mass scales $M_X$ and $M_W$.
Because of the vast difference between these two mass scales, it is important to
keep the two scales apart, so there is no mixing between them. We must, therefore,
“tune” our two mass scales so that we preserve the vanishingly small ratio between
them:

$$\frac{M_W}{M_X} \approx 10^{-12} \tag{18.53}$$

However, it is easy to show that the loop corrections lead to interactions
connecting two $\Phi$ fields with two $H$ fields. This, in turn, means that we must introduce a
term in the action that corresponds to this new graph:

$$V(\Phi, H) = \alpha\,|H|^2\,\mathrm{Tr}\,\Phi^2 + \beta\,\bar{H}\Phi^2 H \tag{18.54}$$

But introducing $\Phi^2 H^2$-type terms into the Higgs potential has mixed these two
mass scales. Thus, the vast ratio between these two mass scales has been destroyed.
We can, at this one-loop order, retune the parameters within the Higgs potential
so that we once again re-establish the hierarchy. The explicit calculation yields:

$$\nu^2 - \left(15\alpha + \frac{9}{2}\beta\right)v^2 \approx 10^{-24}\, v^2 \tag{18.55}$$

Although we can now tune $\alpha$ and $\beta$ to one part in $10^{24}$, we will find that the
two-loop result reintroduces mixing between these two mass scales, and the hierarchy
is again ruined.
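
To see where the combination in Eq. (18.55) comes from, one can insert the SU(5)-breaking vacuum expectation value into the mixed potential. The sketch below assumes $\langle\Phi\rangle = v\,\mathrm{diag}(1,1,1,-3/2,-3/2)$, the mixing terms of Eq. (18.54), and a mass term $-\frac{\nu^2}{2}|H|^2$ for the 5 of Higgs; $H_2$ denotes the SU(2) doublet part of $H$ (a label introduced here for convenience).

```latex
\begin{align*}
\mathrm{Tr}\,\langle\Phi\rangle^2 &= \tfrac{15}{2}\,v^2, \qquad
\bar H_2\, \langle\Phi\rangle^2 H_2 = \tfrac{9}{4}\,v^2\,|H_2|^2, \\
-\tfrac{\nu^2}{2}\,|H_2|^2 + \alpha\,|H_2|^2\,\mathrm{Tr}\,\langle\Phi\rangle^2
 + \beta\,\bar H_2\langle\Phi\rangle^2 H_2
 &= -\tfrac{1}{2}\left[\nu^2 - \left(15\alpha + \tfrac{9}{2}\beta\right)v^2\right]|H_2|^2 .
\end{align*}
```

The bracket is the effective electroweak-scale mass parameter. It must come out roughly $10^{-24}$ times $v^2$, even though each term inside it is individually of order $v^2$.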

We can always retune our parameters at the two-loop level, but then this 
retuning will not survive at the third-loop order. Clearly, we have a problem. It is 
difficult to imagine a more clumsy way in which to unify the known interactions 
than continually to perform a retuning of parameters to incredible accuracy at each 
order in perturbation theory. 


18.7 SO(10)


Let us leave the minimal SU(5) model and go to the next model, SO(10),⁵,⁶
which incorporates many of the attractive “accidents” of the SU(5) model and
explains their origin group theoretically. In general, the series SO(N) is attractive
because of anomaly cancellation. We know from Eq. (2.68) that the generators of
SO(N) can be written in terms of the matrix $(M^{ij})_{ab}$, which is antisymmetric in
$a, b$ and $i, j$. If we insert this value into the anomaly:

$$\mathrm{Tr}\left[\{M^{ij}, M^{kl}\}\, M^{mn}\right] \tag{18.56}$$

we find that this number cannot, in general, be written as a constant tensor with the
indices $i, j, k, l, m$, and $n$ (except for $N = 6$, where a constant tensor with all the
symmetry properties is given by $\epsilon^{ijklmn}$). Thus, all SO(N) theories are anomaly
free except for $N = 6$.

Furthermore, we are interested in complex representations of the Lie group.
However, not all SO(N) groups have complex representations. In particular, the
requirement of complex representations restricts us to the groups $SO(4n + 2)$.
Therefore, the smallest orthogonal group of rank $\geq 4$, with complex
representations, is given by SO(10).

The representation in which we are interested is the 16-dimensional spinor 
representation of SO(10). This is because the adjoint representation has 45 
elements, which is too many, while the vector representation has too few, only 10 
elements. 

SO(10) includes SU(5) as a subgroup; therefore, all representations of SO(10)
can be described by giving their SU(5) quantum numbers. The essential feature of
the 16 is that, under SU(5), it transforms as:

$$16 = \bar{5} \oplus 10 \oplus 1 \tag{18.57}$$

where the 1 refers to the right-handed neutrino, which is missing in the minimal
SU(5) scheme. In one stroke, we see why the $\bar{5} \oplus 10$ representation worked so
well: they are actually part of the 16 representation of SO(10).

The group SO(10) has 45 generators, which can be broken down under SU(5)
as follows:

$$45 = 24 \oplus 10 \oplus \overline{10} \oplus 1 \tag{18.58}$$

where the 24 is the same multiplet of gauge bosons that we encountered earlier
for SU(5).

Symmetry breaking can proceed in numerous ways because of the large
number of subgroups contained within SO(10). The simplest route to symmetry
breaking is given by:

$$SO(10) \to SU(5) \to SU(3) \otimes SU(2) \otimes U(1) \to SU(3) \otimes U(1) \tag{18.59}$$

The first breaking down to SU(5) can be mediated with a 16 of Higgs bosons.
The second breaking down to the Standard Model needs a Higgs in the 45
representation. And the last breaking is accomplished via 10 Higgs:

$$\begin{aligned}
16 &: \ SO(10) \to SU(5) \\
45 &: \ SU(5) \to SU(3) \otimes SU(2) \otimes U(1) \\
10 &: \ SU(3) \otimes SU(2) \otimes U(1) \to SU(3) \otimes U(1)
\end{aligned} \tag{18.60}$$

Yet another favored route is given by:

$$\begin{aligned}
SO(10) &\to SU(4) \otimes SU(2)_L \otimes SU(2)_R \\
&\to SU(3) \otimes SU(2)_L \otimes SU(2)_R \otimes U(1)_{B-L} \\
&\to SU(3) \otimes SU(2) \otimes U(1) \to SU(3) \otimes U(1)
\end{aligned} \tag{18.61}$$

This sequence of breakings is initiated by the following representations of Higgs
bosons: 54, 45, 16, 10:

$$\begin{aligned}
54 &: \ SO(10) \to SU(4) \otimes SU(2)_L \otimes SU(2)_R \\
45 &: \ SU(4) \otimes SU(2)_L \otimes SU(2)_R \to SU(3) \otimes SU(2)_L \otimes SU(2)_R \otimes U(1)_{B-L} \\
16 &: \ SU(3) \otimes SU(2)_L \otimes SU(2)_R \otimes U(1)_{B-L} \to SU(3) \otimes SU(2) \otimes U(1) \\
10 &: \ SU(3) \otimes SU(2) \otimes U(1) \to SU(3) \otimes U(1)
\end{aligned} \tag{18.62}$$


Likewise, the fermion masses can also be analyzed. The fermion masses are
generated by two fermion fields; therefore, we find:

$$16 \otimes 16 = 10 \oplus 126 \oplus 120 \tag{18.63}$$

Thus, fermion masses can only be generated through Yukawa couplings via the
10, 126, or 120 representations for the Higgs particle. To see the relationship with
the SU(5) model, we can decompose these representations as:

$$\begin{aligned}
10 &= 5 \oplus \bar{5} \\
126 &= 1 \oplus \bar{5} \oplus 10 \oplus \overline{15} \oplus 45 \oplus \overline{50} \\
120 &= 5 \oplus \bar{5} \oplus 10 \oplus \overline{10} \oplus 45 \oplus \overline{45}
\end{aligned} \tag{18.64}$$
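
A simple sanity check on these decompositions is that the dimensions add up. The sketch below only tests dimensions, not the conjugation properties of the individual pieces; the dictionary names are introduced here for illustration, and the SO(10) dimensions are computed from standard counting formulas (antisymmetric tensors of rank 2 and 3, the self-dual half of rank 5, and a chiral spinor).

```python
from math import comb

# Dimensions of the SO(10) representations that appear in the text.
SO10_DIMENSIONS = {
    "vector": 10,
    "adjoint": comb(10, 2),          # antisymmetric rank-2 tensor
    "spinor": 2 ** (10 // 2 - 1),    # one chirality of the 32-component spinor
    "120": comb(10, 3),              # antisymmetric rank-3 tensor
    "126": comb(10, 5) // 2,         # self-dual antisymmetric rank-5 tensor
}

# SU(5) content quoted in Eqs. (18.57), (18.58), and (18.64), by dimension only.
SU5_DECOMPOSITIONS = {
    16: [5, 10, 1],
    45: [24, 10, 10, 1],
    10: [5, 5],
    126: [1, 5, 10, 15, 45, 50],
    120: [5, 5, 10, 10, 45, 45],
}


def dimensions_match(decompositions):
    """Map each SO(10) dimension to True if its SU(5) pieces sum to it."""
    return {dim: sum(parts) == dim for dim, parts in decompositions.items()}


if __name__ == "__main__":
    print(SO10_DIMENSIONS)
    print(dimensions_match(SU5_DECOMPOSITIONS))
    print(16 * 16 == 10 + 126 + 120)
```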


To perform calculations with SO(N) models, it is necessary to have a specific
representation of the spinors. We can use a simple recursive technique, generating
the spinor representation of $SO(2n + 2)$ from the spinor representation of $SO(2n)$.

Let $\Gamma_i^{(n)}$ for $i = 1, \ldots, 2n$ form a Clifford algebra for $SO(2n)$. From these
elements, we can generate the Clifford algebra for $SO(2n + 2)$, with elements
$\Gamma_i^{(n+1)}$ for $i = 1, \ldots, 2n + 2$. The recursion relation is:

$$\begin{aligned}
\Gamma_i^{(n+1)} &= \begin{pmatrix} \Gamma_i^{(n)} & 0 \\ 0 & -\Gamma_i^{(n)} \end{pmatrix}, \qquad i = 1, 2, \ldots, 2n \\
\Gamma_{2n+1}^{(n+1)} &= \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \\
\Gamma_{2n+2}^{(n+1)} &= \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}
\end{aligned} \tag{18.65}$$

For $SO(2n + 1)$, its Clifford algebra is formed from $\Gamma_i^{(n+1)}$ for $i = 1, \ldots, 2n + 1$,
that is, by omitting the last element of the Clifford algebra.
Then the generators of the group, in terms of the spinors, are given by:

$$M^{ij} = \frac{1}{4i}\left[\Gamma_i, \Gamma_j\right] \tag{18.66}$$
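
The recursion can be tested directly by building the matrices and checking the Clifford algebra $\{\Gamma_i, \Gamma_j\} = 2\delta_{ij}$. The sketch below assumes the block form $\Gamma_i^{(n+1)} = \mathrm{diag}(\Gamma_i^{(n)}, -\Gamma_i^{(n)})$ together with two off-diagonal elements built from $\sigma^1$ and $\sigma^2$, started from the trivial $1 \times 1$ case; the function names are illustrative.

```python
def next_level(gammas, dim):
    """One recursion step: Clifford algebra of SO(2n) -> SO(2n+2)."""
    size = 2 * dim
    new_gammas = []
    for g in gammas:
        m = [[0j] * size for _ in range(size)]
        for i in range(dim):
            for j in range(dim):
                m[i][j] = g[i][j]
                m[dim + i][dim + j] = -g[i][j]
        new_gammas.append(m)
    off_real = [[0j] * size for _ in range(size)]
    off_imag = [[0j] * size for _ in range(size)]
    for i in range(dim):
        off_real[i][dim + i] = 1 + 0j
        off_real[dim + i][i] = 1 + 0j
        off_imag[i][dim + i] = -1j
        off_imag[dim + i][i] = 1j
    new_gammas += [off_real, off_imag]
    return new_gammas, size


def clifford(n):
    """Gamma matrices for SO(2n), built from the empty SO(0) algebra."""
    gammas, dim = [], 1
    for _ in range(n):
        gammas, dim = next_level(gammas, dim)
    return gammas


def matmul(a, b):
    """Matrix product that skips zero entries (the matrices are very sparse)."""
    n = len(a)
    out = [[0j] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            aik = a[i][k]
            if aik == 0:
                continue
            for j in range(n):
                if b[k][j] != 0:
                    out[i][j] += aik * b[k][j]
    return out


def is_clifford(gammas, tol=1e-12):
    """Check {Gamma_i, Gamma_j} = 2 delta_ij times the identity."""
    size = len(gammas[0])
    for a, ga in enumerate(gammas):
        for b in range(a, len(gammas)):
            ab, ba = matmul(ga, gammas[b]), matmul(gammas[b], ga)
            for i in range(size):
                for j in range(size):
                    expected = 2.0 if (a == b and i == j) else 0.0
                    if abs(ab[i][j] + ba[i][j] - expected) > tol:
                        return False
    return True


if __name__ == "__main__":
    so10 = clifford(5)
    print(len(so10), len(so10[0]), is_clifford(so10))
```

For SO(10) this yields ten $32 \times 32$ matrices; the 32-component spinor splits into two 16-component chiral halves, one of which is the 16 used for the fermions.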


For models beyond SO(10), it is useful to write down generic values of the
predictions of the various theories. For example, we can compute $\sin^2\theta_W$ with the
simple observation that, before symmetry breaking, all couplings of the various
subgroups of the gauge group are equal. Therefore, we know that $g^2\,\mathrm{Tr}\,(T^2)$ for
the various subgroups is the same. Setting them equal, and solving for the ratio
of the coupling constants, we can show that:

$$\begin{aligned}
\sin^2\theta_W &= \frac{e^2}{g^2} = \frac{\mathrm{Tr}\,(T_3^2)}{\mathrm{Tr}\,(Q^2)} \\
\frac{\alpha}{\alpha_c} &= \frac{e^2}{g_c^2} = \frac{\mathrm{Tr}\,(T_c^2)}{\mathrm{Tr}\,(Q^2)}
\end{aligned} \tag{18.67}$$

where $c$ refers to the SU(3) color subgroup.
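
As a worked instance of Eq. (18.67), take the 5 of SU(5) with the charge matrix of Eq. (18.24). The sketch assumes $T_3 = \mathrm{diag}(0,0,0,\tfrac12,-\tfrac12)$ and a color generator $T_c = \mathrm{diag}(\tfrac12,-\tfrac12,0,0,0)$, normalized in the same way:

```latex
\begin{align*}
\mathrm{Tr}\,(Q^2) &= 3\cdot\tfrac{1}{9} + 1 = \tfrac{4}{3}, \qquad
\mathrm{Tr}\,(T_3^2) = \mathrm{Tr}\,(T_c^2) = \tfrac{1}{2}, \\
\sin^2\theta_W &= \frac{\mathrm{Tr}\,(T_3^2)}{\mathrm{Tr}\,(Q^2)} = \frac{3}{8}, \qquad
\frac{e^2}{g_c^2} = \frac{\mathrm{Tr}\,(T_c^2)}{\mathrm{Tr}\,(Q^2)} = \frac{3}{8} .
\end{align*}
```

Both ratios agree with the SU(5) results quoted earlier: the value 3/8 of Eq. (18.31) and the factor 8/3 multiplying $1/\alpha_s$ in Eq. (18.33).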

There are, of course, many GUT models beyond the SO(10) theory. Let us
list a few of the restrictions on these models. First, we must have complex
representations (unless we want to have “mirror” fermions with opposite handedness to
the usual ones, but then we have another problem of explaining why their masses
are so heavy and hence are not seen). The only complex representations occur for:

1. $SU(N)$ for $N > 2$
2. $E_6$
3. $SO(4N + 2)$.

Second, we want anomaly cancellations, which must be checked by hand. For
example, SO(N) is anomaly free if $N \neq 6$.

Of these, the exceptional groups⁷ look attractive, since there are only a finite
number of them and not an infinite series. Of the exceptional groups, $E_6$ is
attractive since it has complex representations and can be broken in many ways,
including:

$$\begin{aligned}
E_6 &\to SO(10) \to SU(5) \\
&\to SU(3)_c \otimes SU(2) \otimes U(1)
\end{aligned} \tag{18.68}$$

The fermions can be accommodated in the 27 of $E_6$. $E_7$ can accommodate the
fermions in a 56 representation (except for the t quark), but has an unacceptable
structure for the weak currents. $E_8$ has also been studied. Its lowest dimension
representation is the adjoint with 248.

One interesting sequence of breakings is given by:

$$E_8 \to SO(16) \to SO(10) \otimes SO(6) \to SU(5) \otimes SU(3) \tag{18.69}$$

$E_8$ and $E_6$ have also been seriously examined from the viewpoint of the
superstring, where they are some of the preferred intermediate steps in symmetry
breaking.


18.8 Beyond GUT


The GUT theories, although they have many compelling features, are a speculative 
step beyond the Standard Model. There are other speculative steps. We mention 
just a few of them. 


18.8.1 Technicolor 


Technicolor⁸ is based on the philosophy that fundamental scalars are unattractive
and undesirable features of an electroweak model. They can be eliminated if they
emerge as bound states of some fermions. In order to drive the symmetry breaking
to give the W boson a mass of around 80 GeV, we must postulate a new color-type
interaction called hypercolor or technicolor, which becomes strong at 1 TeV.

One advantage of this approach is that the hierarchy problem seems to be
avoided. The hierarchy problem emerged when mixing between fundamental
scalars created radiative corrections that forced us to retune the parameters to
preserve the mass difference between ordinary energies and GUT energies. In
the technicolor picture, there are no such scalar couplings, because there are no
scalars.

Unfortunately, the simplest versions of technicolor have been ruled out be- 
cause they have problems with flavor-changing neutral currents. A possibility 
exists of avoiding this by proliferating technifamilies, but then these theories also 
have problems with the various counterparts of the Nambu—Goldstone boson, the 
technipions, which have not been discovered. 


18.8.2 Preons or Subquarks


Every time we have probed deeper into the structure of matter, we have seen a new
layer of constituents, from molecules, to atoms, to nuclei, to subatomic particles, 
and to the quarks. It may not be such a leap of logic, therefore, to suppose that 
the quarks themselves are composite objects. 

Several problems face subquark theories. First, there are technical ones, such 
as eliminating anomalies via the ’t Hooft anomaly matching conditions. But there 
are also more physical questions, such as the lack of any guiding principle by which 
to construct subquark theories. Nature gives us few signals as to which direction 
to take in generating subquark models, of which there are many. One criterion 
is that these subquark theories must explain why the electron and neutrino seem 
point-like with a small or zero mass, even though the energy of their constituents 
is quite large by comparison. Naively, one would expect that, if the electron were
a composite particle, its mass would be comparable to the energy scale of the
composite particles, which would be enormous.


18.8.3 Supersymmetry and Superstrings 


Nature, however, does give us very strong signals, at least in one direction, and 
this is the existence of gravitation. There is no question that gravitation exists, and 
that it is the basic glue holding much of the universe together. Ironically, although 
gravity was the very first force to have its fundamental classical equations revealed 
with the work of Newton three centuries ago, it resists unification with the other 
forces for a very fundamental reason: It is not renormalizable. Gravitation is a 
gauge theory of great sophistication, requiring new ideas in order to marry it to 
the other three fundamental interactions. 

We now turn to the theory of general relativity and to the two theories that 
give us the only known nontrivial extensions of Einstein’s theory: supergravity 
and superstrings. Not only does supersymmetry give us a plausible solution to the 
hierarchy problem, it also gives us theories of gravity in which the divergences 
are partially or even completely cancelled. 

No one knows what will be the ultimate outcome to the vigorous theoretical 
pursuit of a quantum theory of gravity. However, from the standpoint of quantum 
field theory, it has already given us an incredibly rich laboratory by which to test 
old ideas and generate entirely new ones. 

In summary, GUT theories give us the first nontrivial extension of the Standard 
Model. GUT theories based on gauge groups such as SU(5), O(10), and $E_6$ have
the advantage that they are elegant and have fewer coupling constants than the
Standard Model. Although the unification of the various forces takes place at
approximately $10^{15}$ GeV, GUT theories can still be tested if the proton decays.
Minimal SU(5) has now been ruled out experimentally, but theories with more 
complicated groups and couplings are still consistent with experiments. 

We now turn our attention to the problems raised by the GUT theories, such as
the presence of gravity, the hierarchy problem, and the renormalization of quantum 
gravity. 


18.9 Exercises 


1. Show that, if we simply drop the massive X and Y mesons in the action for
the SU(5) GUT theory, the resulting theory (for the fermions and vector
mesons) becomes essentially the Standard Model action with gauge group
SU(3) ⊗ SU(2) ⊗ U(1).


2. Take the spinor representation of SO(10). Explicitly extract the generators of
SU(5) from the generators of SO(10). Explicitly decompose this spinor into
the SU(5) representations in terms of quarks and leptons. In this way, show
how the SO(10) model decomposes into the Standard Model (for the leptons,
quarks, and vector mesons only).


3. In minimal SU(5), construct the explicit expression for the coupling of Higgs 
bosons to fermions to form the Yukawa potential. Show explicitly how the 
24 representation of the Higgs can break this down to the Standard Model. 


4. Isolate which graphs would contribute to proton decay in the minimal SU(5) 
model. By dimensional arguments, do a quick order of magnitude calculation 
of the decay rate. 


5. Let $\Gamma_i$ generate a Clifford algebra. Define:

$$a_j = \frac{1}{2}\left(\Gamma_{2j-1} - i\,\Gamma_{2j}\right) \tag{18.70}$$

Show that $a_j$ and $a_j^\dagger$ form a set of anticommuting annihilation and creation
operators; that is:

$$\{a_i, a_j^\dagger\} = \delta_{ij} \tag{18.71}$$

with all other anticommutators being zero.


6. Let $\tau^a$ be a set of traceless Hermitian n × n matrices, which generate the
algebra of SU(n). Show that $T^a$, defined by:

$$T^a = \sum_{jk} a_j^\dagger\,(\tau^a)_{jk}\,a_k \tag{18.72}$$

also generates the algebra of SU(n). Now show that any bilinear $a_j^\dagger a_k$ can be
expressed as a combination of generators $M_{ij}$ of the group SO(2n):

$$a_j^\dagger a_k = \frac{1}{2}\delta_{jk} + \frac{i}{2}M_{2j-1,2k-1} + \frac{1}{2}M_{2j-1,2k} - \frac{1}{2}M_{2j,2k-1} + \frac{i}{2}M_{2j,2k} \tag{18.73}$$

Show that this proves that:

$$SU(n) \subset SO(2n) \tag{18.74}$$

This shows one way in which to embed SU(5) GUT into SO(10) GUT.


7. Let the $a_j^\dagger$ be five anticommuting creation operators acting on a vacuum $|0\rangle$.
For SO(10), show that a 32-dimensional spinor $|\psi\rangle$ can be decomposed as:

$$\begin{aligned}
|\psi\rangle ={}& |0\rangle\psi_0 + a_j^\dagger|0\rangle\psi_j + \frac{1}{2}a_j^\dagger a_k^\dagger|0\rangle\psi_{jk} + \frac{1}{12}\epsilon^{jklmn}a_k^\dagger a_l^\dagger a_m^\dagger|0\rangle\bar\psi_{nj} \\
&+ \frac{1}{24}\epsilon^{jklmn}a_k^\dagger a_l^\dagger a_m^\dagger a_n^\dagger|0\rangle\bar\psi_j + a_1^\dagger a_2^\dagger a_3^\dagger a_4^\dagger a_5^\dagger|0\rangle\bar\psi_0
\end{aligned} \tag{18.75}$$

where $\psi_{jk}$ is antisymmetric. Show that these form the SU(5) representations
of 1, 5, 10, and their conjugates. Show that this generates the irreducible
16-dimensional spinor and its conjugate. Now generalize your results for
SU(n) and SO(2n), decomposing an SO(2n) spinor into SU(n) multiplets.


8. The breakdown of SO(10) down to SU(5) leaves us with an extra U(1) 
symmetry. Show explicitly how this extra quantum number can be associated 
with B — L, where B is the baryon number and L is the lepton number. 
Explicitly construct the operator which generates B — L. 


9. In Eq. (18.60), there are several ways in which SO(10) may be broken 
down, with various representations of Higgs. Construct explicitly the Higgs 
potential for each of these breaking mechanisms. Analyze their strengths and 
weaknesses. 


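10. In a model with $E_6$ symmetry, we have $E_6 \supset SO(10) \otimes U(1)$. Thus, the 27
and adjoint 78 of $E_6$ can be broken down into:

$$\begin{aligned}
27 &= 16 \oplus 10 \oplus 1 \\
78 &= 45 \oplus 1 \oplus 16 \oplus \overline{16}
\end{aligned} \tag{18.76}$$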


Rewrite this decomposition strictly in terms of SU(5) representations. From 
this, describe the physical quark/lepton content of the 27 and the vector mesons 
of 78. How many new particles must be postulated? 


11. A Weyl neutrino cannot have a mass, since the mass term couples left- and
right-handed fermion fields. However, consider a theory in which the neutrino
is a Majorana fermion, which obeys $\psi = \psi^c$. Then it is possible to construct
a mass term for this field: 


Wevi =(WRCWL=WLCWL * (18.77) 


Notice that this Majorana mass term is now defined totally in terms of $\psi_L$.
Show that this quantity is Lorentz invariant. Show that, in contrast to the
Weyl neutrino action [which is invariant under $\nu \to e^{i\theta}\nu$, which generates a
U(1) symmetry, or lepton number], the Majorana mass term violates lepton
number by two units.


12. Show that a Majorana neutrino mass cannot be generated in the Standard Model.
(Hint: show that a term like $\nu_L^T C \nu_L$ transforms like $I = 1$, $I_3 = 1$, and see if
such a term can be generated by the Higgs mechanism.)


13. Show that a Majorana neutrino mass cannot be generated in a minimal SU(5)
theory, even though there is an $I = 1$ Higgs field. Can it be generated if the
neutrino is an SU(5) singlet? What about SO(10)?


14. Prove that the spinor matrices of O(N) presented in Eq. (18.65) do, in fact, 
satisfy the correct Dirac algebra. 


15. Prove Eq. (18.67).
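
As a numerical sanity check on Exercises 5 and 7 (not a substitute for the proofs), the oscillator construction can be tried on explicit matrices. The sketch below assumes one particular representation of the Clifford algebra, built from tensor products of Pauli matrices; that choice, and all function names, are assumptions made here for illustration and do not come from the text.

```python
from math import comb

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def kron(a, b):
    """Kronecker product of two matrices stored as nested lists."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def gammas(n):
    """2n matrices with {G_i, G_j} = 2 delta_ij, built from Pauli matrices."""
    out = []
    for j in range(n):
        for pauli in (SX, SY):
            m = [[1]]
            for factor in [SZ] * j + [pauli] + [I2] * (n - j - 1):
                m = kron(m, factor)
            out.append(m)
    return out

def oscillators(n):
    """a_j = (Gamma_{2j-1} - i Gamma_{2j}) / 2, with 1-based labels on Gamma."""
    g = gammas(n)
    return [[[(x - 1j * y) / 2 for x, y in zip(rx, ry)]
             for rx, ry in zip(g[2 * j], g[2 * j + 1])] for j in range(n)]

def worst_violation(n):
    """Largest deviation from {a_i, a_j^dag} = delta_ij and {a_i, a_j} = 0."""
    a = oscillators(n)
    dim = 2 ** n
    worst = 0.0
    for i in range(n):
        for j in range(n):
            adj = dagger(a[j])
            mixed = [matmul(a[i], adj), matmul(adj, a[i])]
            same = [matmul(a[i], a[j]), matmul(a[j], a[i])]
            for r in range(dim):
                for c in range(dim):
                    target = 1.0 if (i == j and r == c) else 0.0
                    worst = max(worst, abs(mixed[0][r][c] + mixed[1][r][c] - target))
                    worst = max(worst, abs(same[0][r][c] + same[1][r][c]))
    return worst

def su5_multiplet_sizes():
    """Number of states with k creation operators applied, k = 0..5."""
    return [comb(5, k) for k in range(6)]
```

Calling `worst_violation(n)` for small n should return a value at the level of rounding error, and `su5_multiplet_sizes()` lists the sizes 1, 5, 10, 10, 5, 1 that add up to the 32 components; the states with an even number of creation operators account for 16 of them.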


Chapter 19 
Quantum Gravity 


I was sitting in a chair in the patent office at Bern when all of a sudden a 
thought occurred to me: “If a person falls freely he will not feel his own 
weight.” I was startled. This simple thought made a deep impression on 
me. It impelled me toward a theory of gravitation. —A. Einstein 


19.1 Equivalence Principle


One of the great physical problems of this century is to unify general relativity 
and quantum mechanics. Together they can explain a vast storehouse of physical 
knowledge, from the subatomic realm to the large-scale structure of the universe. 
However, attempts to unify quantum mechanics with general relativity have all met 
with frustration. General relativity has a coupling constant of negative mass
dimension (Newton’s constant) and hence is not renormalizable in the usual sense. To
renormalize gravity, one must necessarily make a radical departure from quantum 
field theory as we know it. 

To see the origin of the problems with quantum gravity, let us first describe 
the classical theory of general relativity. General relativity, like special relativity 
before it, can be reduced to a few postulates.¹


Equivalence Principle: The laws of physics in a gravitational field are identical
to those in a local accelerating frame. 


Einstein stumbled upon this deceptively simple principle and its consequences 
when he noticed that a person in a freely falling elevator would experience no 
apparent weight. He called this “the happiest thought of my life.” He generalized 
this to say that no physical experiment could differentiate a freely falling elevator 
from a frame without any gravity. In particular, it meant that in any gravitating 
system, one can at any point choose a new set of coordinates such that the 
gravitational field disappears. This new set of coordinates is the freely falling 
“elevator frame,” in which space appears locally to resemble ordinary Lorentzian 
space. We wish, therefore, to construct a theory that is invariant under general
coordinate transformations, that is, a theory in which one can choose coordinates 
such that the gravitational field vanishes locally. 

Following our discussion of gauge invariance, we will begin our discussion of 
general relativity by proceeding in three steps: 


1. First, we will write down the transformation properties of scalar, vector, and 
tensor fields under general coordinate transformations. 


2. Then we will construct covariant derivatives of these fields by introducing 
connection fields (Christoffel symbols). 


3. Finally, we will construct the action for general relativity and its coupling to 
matter fields. 


Since we need to express the physical consequences of the equivalence principle
mathematically, we need a mathematical language by which we can easily
transform from one frame to another, that is, tensor calculus. We will define a
general coordinate transformation as an arbitrary reparametrization of the coordinate
system:

$$x^\mu \to \bar{x}^\mu(x) \tag{19.1}$$


Unlike Lorentz transformations, which are global space-time transformations, 
general coordinate transformations are local and hence much more difficult to 
incorporate into a theory. A general coordinate transformation therefore describes 
a distinct reparametrization at every point in space-time. (Historically, the local 
nature of general coordinate transformations was one of the original inspirations 
that led Yang and Mills to postulate local gauge theories.) 

Under reparametrizations, a scalar field transforms simply as follows: 


$$\bar{\phi}(\bar{x}) = \phi(x) \tag{19.2}$$


Vectors transform like $dx^\mu$ or $\partial_\mu$. Using ordinary calculus, we can construct
two types of vectors under general coordinate transformations: covariant vectors,
like $\partial_\mu$, and contravariant vectors, like $dx^\mu$:

$$\begin{aligned}
\frac{\partial}{\partial \bar{x}^\mu} &= \frac{\partial x^\nu}{\partial \bar{x}^\mu}\,\frac{\partial}{\partial x^\nu} \\
d\bar{x}^\mu &= \frac{\partial \bar{x}^\mu}{\partial x^\nu}\,dx^\nu
\end{aligned} \tag{19.3}$$

(It is important to notice that $x^\mu$ is not a genuine tensor under general coordinate
transformations. Not all fields with indices $\mu, \nu, \ldots$ are genuine tensors.)

Given these transformation laws, we can now give the abstract definition
of covariant tensors, with lower indices, and contravariant tensors, with upper 
indices, depending on their transformation properties: 


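$$\begin{aligned}
\bar{A}_\mu(\bar{x}) &= \frac{\partial x^\nu}{\partial \bar{x}^\mu}\,A_\nu(x) \\
\bar{B}^\mu(\bar{x}) &= \frac{\partial \bar{x}^\mu}{\partial x^\nu}\,B^\nu(x)
\end{aligned} \tag{19.4}$$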


Since we have arbitrary coordinate transformations, these vectors transform under 
GL(4), that is, arbitrary invertible real 4 × 4 matrices.

Similarly, we can construct tensors of arbitrary rank or indices. They transform 
as the product of a series of first-rank tensors (vectors): 


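$$\bar{T}^{\mu_1\cdots\mu_m}_{\nu_1\cdots\nu_n}(\bar{x}) = \prod_{i=1}^{m}\left(\frac{\partial \bar{x}^{\mu_i}}{\partial x^{\alpha_i}}\right)\prod_{j=1}^{n}\left(\frac{\partial x^{\beta_j}}{\partial \bar{x}^{\nu_j}}\right)T^{\alpha_1\cdots\alpha_m}_{\beta_1\cdots\beta_n}(x) \tag{19.5}$$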


We can also construct an invariant under general coordinate transformations 
by contracting contravariant tensors with covariant ones: 


$$\bar{A}_\mu \bar{B}^\mu = A_\mu B^\mu = \text{invariant} \tag{19.6}$$


We now introduce a metric tensor $g_{\mu\nu}$ that allows us to calculate distances on
our space. The infinitesimal invariant distance between two points separated by
$dx^\mu$ is given by:

$$ds^2 = dx^\mu\,g_{\mu\nu}\,dx^\nu \tag{19.7}$$

If $g_{\mu\nu}$ is defined to be a second-rank covariant tensor, then this distance $ds^2$ is a
genuine invariant.
The metric tensor transforms as: 


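$$\bar{g}_{\mu\nu}(\bar{x}) = \left(\frac{\partial x^\alpha}{\partial \bar{x}^\mu}\right)\left(\frac{\partial x^\beta}{\partial \bar{x}^\nu}\right)g_{\alpha\beta}(x) \tag{19.8}$$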


From this, we can deduce how the metric tensor changes under an infinitesimal
general coordinate transformation $\delta x^\mu = \epsilon^\mu$. By expanding the previous
transformation rule, we find that the variation of the metric tensor under an infinitesimal
coordinate reparametrization is given as follows:

$$\delta g_{\mu\nu} = \partial_\mu \epsilon^\rho\,g_{\rho\nu} + \partial_\nu \epsilon^\rho\,g_{\mu\rho} + \epsilon^\rho\,\partial_\rho g_{\mu\nu} \tag{19.9}$$


One essential point is that it is always possible to find a local coordinate
system in which we can diagonalize the metric tensor, so that $g_{\mu\nu}$ becomes the
usual Lorentzian metric at a point. Then the space becomes “flat” at that point.
(We emphasize that it is impossible to gauge an arbitrary metric tensor so that
the space is flat at all points in space.) This is the mathematical expression of
Einstein’s original observation, that one should always be able to jump into the
“elevator” frame at any single point in space-time, such that things look locally
flat.
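
The transformation law for the metric can be tried on a familiar case. The following sketch assumes the flat two-dimensional plane, with Cartesian coordinates playing the role of $x^\alpha$ and polar coordinates $(r, \theta)$ playing the role of $\bar{x}^\mu$; the function names are illustrative only. It builds the transformed metric from the Jacobian and compares the invariant interval in both coordinate systems.

```python
import math

def jacobian(r, theta):
    """J[alpha][mu] = d x^alpha / d xbar^mu for x = (x, y) and xbar = (r, theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -r * s],
            [s, r * c]]

def transformed_metric(r, theta):
    """gbar_{mu nu} = J^alpha_mu J^beta_nu g_{alpha beta}, with g the identity."""
    J = jacobian(r, theta)
    return [[sum(J[a][m] * J[a][n] for a in range(2)) for n in range(2)]
            for m in range(2)]

def interval_sq(metric, dx):
    """ds^2 = dx^mu g_{mu nu} dx^nu."""
    return sum(dx[m] * metric[m][n] * dx[n] for m in range(2) for n in range(2))

def compare_intervals(r, theta, dxbar):
    """Return ds^2 computed in polar and in Cartesian coordinates."""
    J = jacobian(r, theta)
    # Contravariant components transform with the Jacobian: dx = J dxbar.
    dx = [sum(J[a][m] * dxbar[m] for m in range(2)) for a in range(2)]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    return interval_sq(transformed_metric(r, theta), dxbar), interval_sq(identity, dx)
```

At any point the transformed metric comes out as diag(1, r²), and the two values returned by `compare_intervals` agree up to rounding, which is the statement that $ds^2$ is a genuine invariant.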

Now that we have defined how scalar, vector, and tensor fields transform under 
reparametrizations, the next step is to write down derivatives of these fields that 
are also covariant. The derivative of a scalar field is a genuine tensor under general 
coordinate transformations: 


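$$\bar{\partial}_\mu \bar{\phi}(\bar{x}) = \frac{\partial x^\nu}{\partial \bar{x}^\mu}\,\partial_\nu \phi(x) \tag{19.10}$$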


However, as in the case of gauge transformations, we find that the derivative of
a vector is not a genuine tensor under general coordinate transformations. Under
this transformation, the derivative can act on the factor $\partial x^\nu/\partial \bar{x}^\mu$, spoiling general
covariance. The solution to this problem, as we know from gauge theory, is to
introduce new fields, called connections, that absorb these unwanted terms. The
connection field for general relativity is called the Christoffel symbol $\Gamma^\lambda_{\mu\nu}$. We
introduce the symbol $\nabla_\mu$, which is a covariant derivative:

$$\begin{aligned}
\nabla_\mu A_\nu &= \partial_\mu A_\nu + \Gamma^\lambda_{\mu\nu} A_\lambda \\
\nabla_\mu A^\nu &= \partial_\mu A^\nu - \Gamma^\nu_{\mu\lambda} A^\lambda
\end{aligned} \tag{19.11}$$


We will define the transformation properties of the connection such that the 
derivative of a vector becomes a genuine tensor, paralleling the situation in gauge 
theory: 


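$$\overline{(\nabla_\mu A_\nu)} = \left(\frac{\partial x^\alpha}{\partial \bar{x}^\mu}\,\frac{\partial x^\beta}{\partial \bar{x}^\nu}\right)\nabla_\alpha A_\beta \tag{19.12}$$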


Given this transformation law, we can, as in gauge theories, extract the trans- 
formation law for the Christoffel symbol: 


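$$\bar{\Gamma}^\lambda_{\mu\nu} = \frac{\partial \bar{x}^\lambda}{\partial x^\rho}\,\frac{\partial x^\alpha}{\partial \bar{x}^\mu}\,\frac{\partial x^\beta}{\partial \bar{x}^\nu}\,\Gamma^\rho_{\alpha\beta} + \frac{\partial^2 \bar{x}^\lambda}{\partial x^\rho\,\partial x^\sigma}\,\frac{\partial x^\rho}{\partial \bar{x}^\mu}\,\frac{\partial x^\sigma}{\partial \bar{x}^\nu} \tag{19.13}$$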


We find that the Christoffel symbol is not a genuine tensor, but has an inhomogeneous
piece. [We recall that the gauge field $A^a_\mu$ also has an inhomogeneous piece
in its transformation under SU(N).]
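
The origin of the inhomogeneous piece can be traced in a few lines. The following is a sketch that assumes the sign convention used in this chapter, in which the connection enters $\nabla_\mu A_\nu$ with a plus sign; the shorthand $J^\alpha{}_\mu$ for the Jacobian is introduced here for convenience and is not the notation of the text.

```latex
% Shorthand (an assumption of this sketch): J^\alpha{}_\mu = \partial x^\alpha / \partial \bar{x}^\mu
\begin{align*}
\bar{A}_\nu &= J^\beta{}_\nu\, A_\beta , \\
\bar{\partial}_\mu \bar{A}_\nu
  &= J^\alpha{}_\mu J^\beta{}_\nu\, \partial_\alpha A_\beta
   + \frac{\partial^2 x^\rho}{\partial \bar{x}^\mu\, \partial \bar{x}^\nu}\, A_\rho .
\end{align*}
% Demand that \bar{\partial}_\mu \bar{A}_\nu + \bar{\Gamma}^\lambda_{\mu\nu} \bar{A}_\lambda
% equal J^\alpha{}_\mu J^\beta{}_\nu (\partial_\alpha A_\beta + \Gamma^\rho_{\alpha\beta} A_\rho):
\begin{align*}
\bar{\Gamma}^\lambda_{\mu\nu}
  &= \frac{\partial \bar{x}^\lambda}{\partial x^\rho}\, J^\alpha{}_\mu J^\beta{}_\nu\, \Gamma^\rho_{\alpha\beta}
   - \frac{\partial \bar{x}^\lambda}{\partial x^\rho}\,
     \frac{\partial^2 x^\rho}{\partial \bar{x}^\mu\, \partial \bar{x}^\nu} .
\end{align*}
% Differentiating (\partial \bar{x}^\lambda / \partial x^\rho) J^\rho{}_\nu = \delta^\lambda_\nu
% with respect to \bar{x}^\mu converts the last term:
\begin{align*}
- \frac{\partial \bar{x}^\lambda}{\partial x^\rho}\,
  \frac{\partial^2 x^\rho}{\partial \bar{x}^\mu\, \partial \bar{x}^\nu}
  &= + \frac{\partial^2 \bar{x}^\lambda}{\partial x^\rho\, \partial x^\sigma}\,
     J^\rho{}_\mu J^\sigma{}_\nu .
\end{align*}
```

The second-derivative term is independent of the vector being differentiated, which is why it must be absorbed by the connection and cannot be removed by any tensorial redefinition.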

Covariant derivatives can be constructed for increasingly complicated tensors
by simply adding more and more Christoffel symbols:

$$\nabla_\mu A_{\nu_1 \nu_2 \cdots} = \partial_\mu A_{\nu_1 \nu_2 \cdots} + \sum_{\text{perm}} \Gamma^\lambda_{\mu\nu_1} A_{\lambda \nu_2 \cdots} \tag{19.14}$$

where we sum over all permutations of the indices. Notice that $\nabla_\mu$ depends on
the tensor it acts on. More and more Christoffel symbols are required if it acts
upon increasingly mixed tensors.

At this stage, we have placed no restrictions on the connection, other than its 
transformation properties. The connection field, at this point, is an independent 
field. We would, however, like to construct a theory in which all fields, including 
the connection, are written in terms of the metric tensor. We thus need a constraint 
on the connection. From the equivalence principle, we know that we can always 
choose the “elevator frame” where the metric tensor becomes the Lorentz
metric; that is, the derivative of the metric tensor vanishes in this inertial frame. 
The covariant generalization of this statement is that the covariant derivative of 
the metric tensor in any frame vanishes: 


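$$\nabla_\mu g_{\nu\lambda} = \partial_\mu g_{\nu\lambda} + \Gamma^\rho_{\mu\nu}\,g_{\rho\lambda} + \Gamma^\rho_{\mu\lambda}\,g_{\nu\rho} = 0 \tag{19.15}$$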


The number of independent equations in this constraint (4 x 10 = 40) is exactly 
equal to the number of independent components of the connection if we assume 
that the connection is symmetric in its lower indices. Thus, we can eliminate the 
connection field totally in terms of the metric tensor. To do this, we first write the 
equations in terms of the connection with only lower indices: $\Gamma_{\mu\nu,\lambda} = g_{\lambda\rho}\Gamma^\rho_{\mu\nu}$.

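Now let us rewrite the vanishing of the covariant derivative of the metric tensor
in terms of $\Gamma_{\mu\nu,\lambda}$. Written out explicitly (and cyclically rotating the indices), we
find:

$$\begin{aligned}
\partial_\mu g_{\nu\lambda} + \Gamma_{\mu\nu,\lambda} + \Gamma_{\mu\lambda,\nu} &= 0 \\
\partial_\nu g_{\lambda\mu} + \Gamma_{\nu\lambda,\mu} + \Gamma_{\nu\mu,\lambda} &= 0 \\
\partial_\lambda g_{\mu\nu} + \Gamma_{\lambda\mu,\nu} + \Gamma_{\lambda\nu,\mu} &= 0
\end{aligned} \tag{19.16}$$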


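These three equations are the same constraint with the indices relabeled. But by
adding the first two equations and subtracting the last (and remembering that the
Christoffel symbol is symmetric in the lower indices), we then find:

$$\Gamma_{\mu\nu,\lambda} = -\frac{1}{2}\left(\partial_\mu g_{\nu\lambda} + \partial_\nu g_{\mu\lambda} - \partial_\lambda g_{\mu\nu}\right) \tag{19.17}$$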
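
As an illustration, the connection can be evaluated numerically for a simple metric. The sketch below assumes the unit two-sphere, $ds^2 = d\theta^2 + \sin^2\theta\, d\phi^2$, which is an example chosen here and not one used in the text, and it keeps the overall minus sign of the convention in this chapter (opposite to the sign found in many other references). Derivatives are taken by central differences.

```python
import math

def metric(x):
    """Unit two-sphere metric in coordinates x = (theta, phi)."""
    theta, _phi = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def d_metric(x, h=1e-5):
    """dg[mu][nu][lam] = d_mu g_{nu lam}, estimated by central differences."""
    dg = []
    for mu in range(2):
        xp, xm = list(x), list(x)
        xp[mu] += h
        xm[mu] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([[(gp[n][l] - gm[n][l]) / (2 * h) for l in range(2)]
                   for n in range(2)])
    return dg

def christoffel_lower(x):
    """G[mu][nu][lam] = Gamma_{mu nu, lam}, with the overall minus sign used here."""
    dg = d_metric(x)
    return [[[-0.5 * (dg[m][n][l] + dg[n][m][l] - dg[l][m][n]) for l in range(2)]
             for n in range(2)] for m in range(2)]

def metricity_residual(x):
    """Largest |d_mu g_{nu lam} + Gamma_{mu nu, lam} + Gamma_{mu lam, nu}|."""
    dg, G = d_metric(x), christoffel_lower(x)
    return max(abs(dg[m][n][l] + G[m][n][l] + G[m][l][n])
               for m in range(2) for n in range(2) for l in range(2))
```

With this sign convention one expects $\Gamma_{\theta\phi,\phi} = -\sin\theta\cos\theta$ and $\Gamma_{\phi\phi,\theta} = +\sin\theta\cos\theta$, and `metricity_residual` should vanish up to rounding, confirming that the metric is covariantly constant.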


19.2 Generally Covariant Action


Now that we have defined the transformation properties of the fields and con- 
structed covariant derivatives, the last step is to write down the action for general 
relativity and couple it to other fields. To construct the action, we will need to 
take the commutator between two covariant derivatives. In flat space, this com- 
mutator vanishes. However, for general coordinate transformations, we find that 
this commutator does not vanish. By explicit construction, we find: 


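$$\begin{aligned}
[\nabla_\mu, \nabla_\nu]\,A_\lambda &= R^\rho{}_{\mu\nu\lambda}\,A_\rho \\
R^\rho{}_{\mu\nu\lambda} &= \partial_\mu \Gamma^\rho_{\nu\lambda} - \partial_\nu \Gamma^\rho_{\mu\lambda} - \Gamma^\sigma_{\nu\lambda}\Gamma^\rho_{\mu\sigma} + \Gamma^\sigma_{\mu\lambda}\Gamma^\rho_{\nu\sigma}
\end{aligned} \tag{19.18}$$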


We call $R^\rho{}_{\mu\nu\lambda}$ the Riemann curvature tensor. (Alternatively, we could have derived
the curvature tensor by taking a vector $A_\rho$ and then moving it around a closed
path using parallel transport. After completing the circuit, the vector has rotated
by the amount $R^\rho{}_{\mu\nu\lambda} A_\rho \Delta^{\mu\nu}$, where $\Delta^{\mu\nu}$ is the area tensor of the closed path.)
From this, we can see the close analogy between the elements of gauge theory and 
general relativity. This close correspondence can be symbolically represented as 
follows: 


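$$\begin{aligned}
A^a_\mu &\to \Gamma^\lambda_{\mu\nu} \\
D_\mu &\to \nabla_\mu \\
F^a_{\mu\nu} &\to R^\rho{}_{\mu\nu\lambda}
\end{aligned} \tag{19.19}$$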


By suitably contracting the indices in the curvature tensor, we can reduce it to
tensors of smaller rank. Contracting $\rho$ and $\nu$ gives us a second-rank curvature
tensor:

$$R_{\mu\lambda} = R^\rho{}_{\mu\nu\lambda}\,\delta^\nu_\rho \tag{19.20}$$


This is called the Ricci curvature tensor. 
Finally, we can construct a genuine invariant by contracting all the indices: 


$$R_{\mu\nu}\,g^{\mu\nu} = R \tag{19.21}$$


Using ordinary calculus, we can also construct the transformation properties 
of the volume element: 


$$d^4\bar{x} = \det\left(\frac{\partial \bar{x}^\mu}{\partial x^\nu}\right) d^4x \tag{19.22}$$

It is also easy to calculate the transformation properties of the determinant of
the metric tensor g. Because det(ABC) = det A det B det C, we can easily show:

$$\sqrt{-\bar{g}(\bar{x})} = \det\left(\frac{\partial x^\mu}{\partial \bar{x}^\nu}\right)\sqrt{-g(x)} \tag{19.23}$$


An object that transforms like this is not a scalar in the usual sense. We call it a 
scalar density. 


The point is that now the product of these two is a genuine invariant: 

$$\sqrt{-g}\;d^4x = \text{invariant} \tag{19.24}$$


From this we can construct actions. 
In order to write down an action, we wish to fulfill a few key conditions: 


1. The action must contain no more than two derivatives, or else there are ghosts 
in the theory that threaten unitarity. 


2. The action must be invariant under general coordinate transformations. 


Surprisingly, we find that there are only two solutions to these constraints.
The first is given by:

$$S = -\frac{1}{2\kappa^2}\int d^4x\,\sqrt{-g}\,R \tag{19.25}$$

(The second is the cosmological term, which is proportional to $\Lambda\sqrt{-g}$, although
experimentally $\Lambda$ is very close to zero.) This is the celebrated Einstein-Hilbert
action, which is the starting point for all calculations in general relativity.

We can also calculate the equations of motion from this action. By making a 
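small variation in the metric $\delta g_{\mu\nu}$, we can compute:

$$\begin{aligned}
\delta g &= g\,g^{\mu\nu}\,\delta g_{\mu\nu} \\
\delta\sqrt{-g} &= \frac{1}{2}\sqrt{-g}\,g^{\mu\nu}\,\delta g_{\mu\nu} \\
\delta R_{\mu\nu} &= \nabla_\mu\,\delta\Gamma^\rho_{\rho\nu} - \nabla_\rho\,\delta\Gamma^\rho_{\mu\nu}
\end{aligned} \tag{19.26}$$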


Taking the variation of the action, we then find the equations of motion:

$$R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R = 0 \tag{19.27}$$


(The term $\delta R_{\mu\nu}$ does not contribute because it turns into a total derivative.)
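
The step from the action to the vacuum field equations can be sketched as follows. The sketch is written in terms of the variation of the inverse metric, $\delta g^{\mu\nu}$, a choice made here for convenience; it is related to the variation of the metric itself by $g^{\mu\nu}\delta g_{\mu\nu} = -g_{\mu\nu}\delta g^{\mu\nu}$.

```latex
\begin{align*}
\delta\left(\sqrt{-g}\,R\right)
  &= \delta\left(\sqrt{-g}\,g^{\mu\nu}R_{\mu\nu}\right) \\
  &= \sqrt{-g}\left(R_{\mu\nu} - \tfrac{1}{2}\,g_{\mu\nu}R\right)\delta g^{\mu\nu}
     + \sqrt{-g}\,g^{\mu\nu}\,\delta R_{\mu\nu} ,
\end{align*}
% using \delta\sqrt{-g} = -\tfrac{1}{2}\sqrt{-g}\,g_{\mu\nu}\,\delta g^{\mu\nu}.
% The last term is a covariant divergence and integrates to a surface term,
% so requiring \delta S = 0 for arbitrary \delta g^{\mu\nu} gives
\begin{align*}
R_{\mu\nu} - \tfrac{1}{2}\,g_{\mu\nu}R = 0 .
\end{align*}
% Taking the trace in four dimensions, where g^{\mu\nu}g_{\mu\nu} = 4:
\begin{align*}
g^{\mu\nu}\left(R_{\mu\nu} - \tfrac{1}{2}\,g_{\mu\nu}R\right) = R - 2R = -R = 0
  \quad\Longrightarrow\quad R_{\mu\nu} = 0 .
\end{align*}
```

In other words, in the absence of matter the equations of motion are equivalent to the vanishing of the Ricci tensor.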

In the presence of matter fields, we must alter this equation. We know that
scalar matter couples to gravity via the interaction $\frac{1}{2}\sqrt{-g}\,g^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi \sim g^{\mu\nu}T_{\mu\nu}$;
therefore, the right-hand side of the previous equation should contain the
energy-momentum tensor.

One should mention that this equation reduces to the usual Newtonian potential
equation in the limit that $c \to \infty$. In this limit, the metric tensor becomes the
Lorentz metric, except for the term $g_{00}$:

$$g_{00} - 1 \sim \phi \tag{19.28}$$

Then the $\phi$ field becomes the scalar potential, and Einstein’s equation reduces to
Poisson’s equation:

$$R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R = -\frac{8\pi G}{c^2}\,T_{\mu\nu} \;\to\; \nabla^2\phi = 4\pi G\rho \tag{19.29}$$


where p is the source term. From this, one can derive Newton’s original universal 
law of gravitation, that the gravitational force is proportional to the product of the 
masses and inversely proportional to the distance of separation squared. 


19.3 Vierbeins and Spinors in General Relativity 


The coupling of the gravitational field to other fields is also straightforward. The 
generally covariant action for scalar and Yang-Mills fields is given by: 


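$$\begin{aligned}
\mathcal{L} &= \sqrt{-g}\;\frac{1}{2}\left(g^{\mu\nu}\,\partial_\mu\phi\,\partial_\nu\phi - m^2\phi^2\right) \\
\mathcal{L} &= -\frac{1}{4}\sqrt{-g}\;F^a_{\mu\nu}\,g^{\mu\rho}g^{\nu\sigma}\,F^a_{\rho\sigma}
\end{aligned} \tag{19.30}$$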


However, the coupling of gravity to spinor fields leads to an immediate dif- 
ficulty: There are no finite dimensional spinorial representations of GL(4). This 
prevents a naive incorporation of spinors into general relativity. There is, for- 
tunately, a trick that we may use to circumvent this problem. Although spinor 
representations do not exist for general covariance, there are, of course, spinorial 
representations of the Lorentz group. We utilize this fact and construct a flat 
tangent space at every point in the space. Imagine space-time as a rolling hill. 
Then the tangent space would correspond to placing a flat plane on each point of 
the hill. Spinors can then be defined at any point on the curved manifold only if 
they transform within the flat tangent space. 

We will label the flat tangent space indices with Latin letters a, b, c, ..., while
tensors under general coordinate transformations are labeled by Greek letters:
$\mu, \nu, \rho, \sigma, \ldots$

In order to marry the two sets of indices, we will introduce the vierbein or 
tetrad, which is a mixed tensor: 


$$\text{Vierbein:}\quad e^a{}_\mu(x) \tag{19.31}$$

The inverse of this matrix is given by $e_a{}^\mu$.


The vierbein can be viewed as the “square root” of the metric tensor via the 
following: 


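$$\begin{aligned}
e^a{}_\mu\,\eta_{ab}\,e^b{}_\nu &= g_{\mu\nu} \\
e_a{}^\mu\,\eta^{ab}\,e_b{}^\nu &= g^{\mu\nu} \\
e^a{}_\mu\,e_b{}^\mu &= \delta^a_b
\end{aligned} \tag{19.32}$$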
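
A concrete case may help fix the idea of the vierbein as a “square root” of the metric. The sketch below assumes flat Minkowski space written in spherical coordinates, $ds^2 = dt^2 - dr^2 - r^2 d\theta^2 - r^2\sin^2\theta\, d\phi^2$, with a diagonal vierbein; this example and the function names are chosen here for illustration and are not taken from the text.

```python
import math

ETA = [1.0, -1.0, -1.0, -1.0]  # diagonal of the flat tangent-space metric eta_ab

def vierbein(r, theta):
    """Diagonal entries of e^a_mu for flat space in (t, r, theta, phi) coordinates."""
    return [1.0, 1.0, r, r * math.sin(theta)]

def metric_from_vierbein(e):
    """Diagonal of g_{mu nu} = e^a_mu eta_ab e^b_nu (everything is diagonal here)."""
    return [ETA[a] * e[a] ** 2 for a in range(4)]

def det_vierbein(e):
    return math.prod(e)

def sqrt_minus_g(g_diag):
    """sqrt(-det g) for a diagonal metric."""
    return math.sqrt(-math.prod(g_diag))
```

For any r > 0 and 0 < θ < π the reconstructed metric is diag(1, −1, −r², −r² sin²θ), and the determinant of the vierbein equals $\sqrt{-g} = r^2 \sin\theta$, the familiar volume factor of spherical coordinates.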


Since the Lorentz group acts on the tangent space indices, we can define 
spinors on the tangent space. The Dirac matrices $\gamma^a$ can now be contracted onto
vierbeins:

$$\gamma^a\,e_a{}^\mu(x) = \gamma^\mu(x) \tag{19.33}$$

It is easy to show that the anticommutator between two of these matrices yields the
metric tensor:

$$\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu} \tag{19.34}$$


Our goal is to construct the generally covariant Dirac equation. We introduce a 
spinor $\psi(x)$ that is defined to be a scalar under general coordinate transformations
(and an ordinary spinor under flat tangent space Lorentz transformations):

$$\begin{aligned}
\text{Coordinate transformations:}\quad & \psi \to \psi \\
\text{Lorentz transformations:}\quad & \psi \to e^{i\epsilon^{ab}(x)\sigma_{ab}}\,\psi
\end{aligned} \tag{19.35}$$


It is important to note that we have introduced local Lorentz transformations on 
the flat tangent space, so $\epsilon_{ab}$ is a function of space-time.

This, of course, means that the derivative of a spinor is no longer a genuine
tensor. As before, we must introduce a connection field $\omega_\mu^{ab}$ that allows us to
gauge the Lorentz group. The covariant derivative for gauging the Lorentz group
is therefore:

$$\nabla_\mu\psi = \partial_\mu\psi - \frac{i}{4}\,\omega_\mu^{ab}\,\sigma_{ab}\,\psi \tag{19.36}$$

The generally covariant Dirac equation is therefore given by:

$$(i\gamma^\mu\nabla_\mu - m)\,\psi = 0 \tag{19.37}$$


and hence the action for a Dirac particle interacting with gravity is given by: 
$$\mathcal{L} = -\frac{1}{2\kappa^2}\sqrt{-g}\,R + e\,\bar{\psi}\,(i\gamma^\mu\nabla_\mu - m)\,\psi \tag{19.38}$$

where $e = \det e^a{}_\mu = \sqrt{-g}$.

This new connection field gives us an alternative way to construct the Riemann 
curvature tensor. By taking the commutator of two covariant derivatives, we can 
construct a new version of the curvature tensor: 


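$$[\nabla_\mu, \nabla_\nu]\,\psi = -\frac{i}{4}\,R^{ab}_{\mu\nu}\,\sigma_{ab}\,\psi \tag{19.39}$$

Written out, this curvature tensor is generally covariant in $\mu, \nu$, but flat in a, b:

$$R^{ab}_{\mu\nu} = \partial_\mu\omega_\nu^{ab} - \partial_\nu\omega_\mu^{ab} + \omega_\mu^{ac}\,\omega_{\nu c}{}^{b} - \omega_\nu^{ac}\,\omega_{\mu c}{}^{b} \tag{19.40}$$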


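At this point, the connection field $\omega_\mu^{ab}$ is still an independent field. We can
eliminate it in favor of the vierbein by imposing an external constraint on the theory:

$$\nabla_\mu e^a{}_\nu = \partial_\mu e^a{}_\nu + \Gamma^\rho_{\mu\nu}\,e^a{}_\rho + \omega_\mu^{ab}\,e_{b\nu} = 0 \tag{19.41}$$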


Again, the number of independent equations in the constraint (4 x 6 = 24) equals 
the number of independent components of the connection field, so we have elim- 
inated the connection field entirely as an independent field. 

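The connection field can be calculated in much the same way as the $\Gamma_{\mu\nu,\lambda}$ was
calculated, by rotating the various indices and then adding and subtracting them.
The result is:

$$\omega_\mu^{ab} = \frac{1}{2}\,e^{a\nu}\left(\partial_\mu e^b{}_\nu - \partial_\nu e^b{}_\mu\right) + \frac{1}{4}\,e^{a\rho}e^{b\sigma}\left(\partial_\sigma e^c{}_\rho - \partial_\rho e^c{}_\sigma\right)e_{c\mu} - (a \leftrightarrow b) \tag{19.42}$$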


19.4 GUTs and Cosmology


With this elementary introduction, we can now make qualitative statements con- 
cerning the impact of GUT theory on cosmology. Any study of the origins of the 
universe, of course, must be prefaced with a clear statement of assumptions and 
prejudices, since the origin of the universe is not a reproducible event and cannot 
be duplicated in the laboratory. 

However, general relativity has given us a theoretical and experimental frame- 
work in which to explain a large body of observational information. The scenario 
emerging from general relativity, that the universe started with a cataclysmic ex- 
plosion 10-20 billion years ago, is supported by three strong pieces of information: 


1. Red shift. The far-away stars and galaxies are receding from us, as measured 
by the Doppler shift. We do not see a blue shift in the heavens. General 
relativity is in agreement with Hubble’s law,² which states that the farther
away a galaxy is, the faster it is moving away from us. Experimentally, this
linear relationship between distance and velocity is summarized in Hubble’s
constant, measured to be H ≈ 15 km/sec per million light-years.


2. Nucleosynthesis. The theory predicts that about a quarter of all hydrogen in 
the heavens should have fused into helium by the Big Bang. It also correctly 
predicts the abundance of many other elements. 


3. 3 K background radiation. The “echo” from the Big Bang, as predicted by
Gamow,³,⁴ should behave like blackbody radiation and should now have
cooled down to the microwave range. The observed temperature of the
background microwave radiation, measured by satellite to be 2.736 ± 0.01 K,
fits well with Gamow’s original prediction.
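
The quoted value of Hubble’s constant can be put into other common forms with a short calculation. The sketch below assumes the rounded conversion 1 parsec ≈ 3.2616 light-years; the function names are illustrative. It uses the fact that light covers one million light-years in one million years, so that 1/H in millions of years is just c/H with c in km/sec.

```python
C_KM_PER_SEC = 299_792.458   # speed of light in km/sec
LY_PER_PARSEC = 3.2616       # light-years per parsec (rounded)

def hubble_per_megaparsec(h_per_mly):
    """Convert H from km/sec per million light-years to km/sec per megaparsec."""
    return h_per_mly * LY_PER_PARSEC

def hubble_time_gyr(h_per_mly):
    """Characteristic expansion time 1/H, in billions of years."""
    million_years = C_KM_PER_SEC / h_per_mly
    return million_years / 1000.0
```

For H ≈ 15 km/sec per million light-years this gives roughly 49 km/sec per megaparsec and 1/H ≈ 20 billion years, the same order as the age of 10 to 20 billion years quoted above.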


More specifically, the Big Bang can be viewed as a solution to Einstein’s 
equations in the presence of matter. 

Let us assume an ansatz for the metric tensor that solves Einstein’s equations, 
for example, the Robertson—Walker metric. We assume that the metric tensor 
is radially symmetric, with all angular dependence omitted, and that the time 
dependence of the metric is represented by a single function R(t), which sets the 
scale of the universe and acts as an effective “radius” of the universe. We assume 
the ansatz: 


$$ds^2 = dt^2 - R^2(t)\left(\frac{dr^2}{1 - kr^2} + r^2\,d\Omega^2\right) \tag{19.43}$$

where $d\Omega^2$ is the usual solid angle differential, and k is a constant. Now let us
assume a highly idealized model of the universe, consisting of a fluid of galactic
clusters, with an average density $\rho(t)$ and average internal pressure $p(t)$. In this
idealized frame, the energy-momentum tensor becomes: $T^0{}_0 = \rho$, $T^i{}_i = -p$, and
all other components are zero.

Because all angular dependence has been explicitly eliminated, we find that,
after inserting this ansatz into Einstein’s equations, these equations collapse into
only two equations for R(t):


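$$\begin{aligned}
\left(\frac{\dot{R}}{R}\right)^2 &= \frac{8\pi G}{3}\,\rho - \frac{k}{R^2} + \frac{\Lambda}{3} \\
\frac{\ddot{R}}{R} &= -4\pi G\left(p + \frac{\rho}{3}\right) + \frac{\Lambda}{3}
\end{aligned} \tag{19.44}$$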


We can always rescale R so that k is +1 (closed universe), 0 (flat universe), or —1 
(open universe). 

These two relativistic equations actually have a simplified Newtonian
interpretation. Imagine a point particle of unit mass at the surface of a sphere of
radius R. The kinetic energy of the particle is $\frac{1}{2}\dot{R}^2$. Its potential energy is
$-GM/R = -(4\pi R^3/3)\,\rho\,(G/R)$. Then conservation of energy states:

$$\frac{d}{dt}\left[\frac{1}{2}\dot{R}^2 - \left(\frac{4\pi R^3\rho}{3}\right)\frac{G}{R}\right] = 0 \tag{19.45}$$


which yields our first relativistic equation. 

The second relativistic equation can also be seen as the conservation of energy. 
Imagine a sphere of radius R filled with a fluid, such that the conservation of energy 
yields dU = —p dV for the kinetic theory of gases. Then this becomes: 


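$$\frac{d}{dt}\left(\frac{4\pi R^3\rho}{3}\right) = -p\,\frac{d}{dt}\left(\frac{4\pi R^3}{3}\right) \tag{19.46}$$

If we assume the cosmological constant is zero, $\Lambda = 0$, and neglect the pressure
(p = 0), we can eliminate $\rho$ and obtain one equation:

$$2R\ddot{R} + \dot{R}^2 + k = 0 \tag{19.47}$$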


We assume that k = 0. For sufficiently large times, we find that the radius of the 
universe expands in time as a power law: 


$$R = \left(\frac{9}{2}\,GM\right)^{1/3} t^{2/3} \tag{19.48}$$

This power law describes the expanding Friedmann universe,⁵ and was one of
the first cosmological models to be found from Einstein’s equations. This model, 
in turn, can explain the three experimental features of the Big Bang. 
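
The power law can be checked numerically. The sketch below assumes k = 0, Λ = 0, and zero pressure, so that the first equation for R(t) reduces to $\dot{R}^2 = 2GM/R$ with $M = (4\pi/3)R^3\rho$ held fixed; units with GM = 1 and all function names are choices made here for illustration. It integrates this equation with a fourth-order Runge-Kutta step, evaluates $2R\ddot{R} + \dot{R}^2$ by finite differences, and also gives the age implied by $H = \dot{R}/R = 2/(3t)$.

```python
def analytic_radius(t, gm=1.0):
    """R(t) = (9 GM / 2)^(1/3) t^(2/3)."""
    return (4.5 * gm) ** (1.0 / 3.0) * t ** (2.0 / 3.0)

def integrate_radius(t0, t1, steps=20000, gm=1.0):
    """Integrate dR/dt = sqrt(2 GM / R) from t0 to t1, starting on the analytic curve."""
    def rate(radius):
        return (2.0 * gm / radius) ** 0.5
    h = (t1 - t0) / steps
    radius = analytic_radius(t0, gm)
    for _ in range(steps):
        k1 = rate(radius)
        k2 = rate(radius + 0.5 * h * k1)
        k3 = rate(radius + 0.5 * h * k2)
        k4 = rate(radius + h * k3)
        radius += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return radius

def residual(t, gm=1.0, h=1e-3):
    """Finite-difference value of 2 R R'' + R'^2 on the analytic curve (k = 0)."""
    rm, r0, rp = (analytic_radius(t - h, gm), analytic_radius(t, gm),
                  analytic_radius(t + h, gm))
    first = (rp - rm) / (2.0 * h)
    second = (rp - 2.0 * r0 + rm) / h ** 2
    return 2.0 * r0 * second + first ** 2

def matter_dominated_age_gyr(h_km_sec_per_mly):
    """Age t = 2 / (3 H) in billions of years, for H in km/sec per million light-years."""
    return (2.0 / 3.0) * 299_792.458 / h_km_sec_per_mly / 1000.0
```

The integrated and analytic radii should agree closely, and `residual` should be zero up to discretization error. For H ≈ 15 km/sec per million light-years the implied age is about 13 billion years, inside the range of 10 to 20 billion years quoted in the text.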

These general features, however, do not go far enough in terms of explaining 
precisely how the universe cooled down since the initial Big Bang. The general 
consensus is that the theory of elementary particles will ultimately play a decisive 
role in this respect. From the point of view of GUT theory, the Big Bang can be 
studied via a series of stages in the cooling of the universe. The boundary be- 
tween each stage corresponds to the energy scale at which spontaneous symmetry 
breaking occurred. 

A rough sequence of events, supported by the general features of any GUT 
theory, is as follows: 


1. $10^{-43}$ sec. At the Planck energy $10^{19}$ GeV, all the symmetries of gauge
theory were supposedly united into a single force. Gravitational effects were
strong and, in fact, were unified with the GUT forces.


2. $10^{-36}$ sec. At the energy scale $M_X = 10^{15}$ GeV, the GUT gauge group broke
apart into SU(3) ⊗ SU(2) ⊗ U(1) of the Standard Model.


3. $10^{-10}$ sec. At $10^{2}$ GeV, the electro-weak symmetry SU(2) ⊗ U(1) broke
down into $U(1)_{em}$.


4. $10^{-6}$ sec. At 1 GeV, the quarks bound together to form hadrons. Shortly
thereafter, nuclei slowly began to condense without being torn apart.


5. $10^{12}$ sec. At $10^{-9}$ GeV, atoms condensed without being ionized. Photons
could now move through space without being easily absorbed, so space
became black. Before this, space was full of ionized plasma and hence was
opaque to light.


6. $10^{16}$ sec. Galaxies began to condense about 1 billion years after the Big
Bang.


7. $10^{17}$ sec. The present day era, about 10 to 20 billion years after the Big
Bang.
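
The early entries in this sequence can be compared with a common order-of-magnitude rule of thumb for a radiation-dominated universe, t ≈ (1 MeV / kT)² sec. This rule is an assumption brought in here and is not derived in the text; it ignores factors of order one that depend on the number of particle species, and it does not apply once matter dominates (the later entries). The stage labels and energies below simply restate the first four items of the list.

```python
import math

def radiation_era_time_seconds(energy_gev):
    """Rule-of-thumb age of a radiation-dominated universe at kT = energy_gev."""
    energy_mev = energy_gev * 1.0e3
    return (1.0 / energy_mev) ** 2

STAGES = [
    ("Planck scale", 1.0e19),
    ("GUT symmetry breaking", 1.0e15),
    ("electro-weak symmetry breaking", 1.0e2),
    ("hadron formation", 1.0),
]

def timeline_exponents():
    """Power of ten of the estimated time, in seconds, for each stage."""
    return [round(math.log10(radiation_era_time_seconds(energy)))
            for _name, energy in STAGES]
```

The estimated exponents agree with the times quoted in the list ($10^{-43}$, $10^{-36}$, $10^{-10}$, and $10^{-6}$ sec) to within about one order of magnitude, which is all that such a rule can be expected to deliver.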


Given this rough sequence of events for the beginning of the universe, we can 
begin to ask what implications this has for the GUT theories. We find that the 
GUT theories give us a clue to the solution to two long-standing cosmological 
problems: the matter-antimatter asymmetry in the universe, and the flatness— 
horizon problems. 

It is a fairly well established fact that our visible universe is composed primarily
of matter, rather than antimatter. Although one may speculate that, at the Big Bang, 
there were equal quantities of matter and antimatter present, we find that our 
universe is quite asymmetric. In fact, a rough estimate of the baryon-antibaryon
asymmetry is that the number $N_B$ of baryons exceeds the number $N_{\bar{B}}$
of antibaryons by a fractional amount of order $10^{-9}$; that is:

$$\frac{N_B - N_{\bar{B}}}{N_B + N_{\bar{B}}} \sim 10^{-9} \tag{19.49}$$

(In fact, this small asymmetry between matter and antimatter is probably the
reason we exist in the first place to ponder this question.)

Unfortunately, the Standard Model gives us no clue as to why this asymmetry 
between matter and antimatter exists. In the Standard Model, we must impose an 
initial asymmetry at t = 0. However, even if we put C and CP violating terms in
the Standard Model at the origin of time, the CPT theorem can wash out baryon
asymmetry. This is because, at equilibrium, baryons and antibaryons will have 
the same Boltzmann distribution because, by the CPT theorem, they must have 
the same mass. 

Thus, in order to explain baryon asymmetry, we must have two features: 


1. Breaking of C and CP symmetry’* and baryon number at the origin of time. 


2. A cosmological phase when these C and CP violating processes were out of
equilibrium. 


Fortunately, GUT theories can accommodate both these desirable features. The
first criterion can be satisfied by GUT theory in a number of ways. The second 
criterion is also satisfied if we analyze the cooling of the early universe. 

Assume that, at GUT times, there was an $X$ particle that decayed into quarks 
and leptons and violated these symmetries. For very high temperatures, on the 
order of $kT > M_X$, the $X$ particle existed in thermal equilibrium with other particles, 
and the decay of this particle could create a net baryon asymmetry. Normally, such 
a baryon asymmetry is cancelled by the inverse decays of the particles at equilibrium, 
so the net baryon asymmetry does not survive. However, as the temperature 
of the universe decreased and $kT < M_X$, one can show, by examining decay rates 
and cross sections, that the $X$ was no longer in thermal equilibrium, and any net 
baryon asymmetry was frozen permanently. The population of $X$ particles and 
the number of inverse decays was suppressed by the Boltzmann factor: 

$$\exp(-M_X/kT) \tag{19.50}$$
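To get a feeling for how quickly this suppression sets in, the factor can be evaluated for a few values of the ratio of the X mass to the temperature. The following is only a minimal sketch: the mass value of 10^15 GeV is an assumed, illustrative GUT scale, and the function name `boltzmann_factor` is introduced here for the example.

```python
import math

def boltzmann_factor(mass_gev, kt_gev):
    """Return exp(-M/kT), the suppression of a heavy species of mass M."""
    return math.exp(-mass_gev / kt_gev)

M_X = 1.0e15  # assumed X mass in GeV, for illustration only

for ratio in (0.1, 1.0, 10.0, 30.0):
    kT = M_X / ratio  # temperature at which M_X / kT equals `ratio`
    factor = boltzmann_factor(M_X, kT)
    print(f"M_X/kT = {ratio:5.1f}  ->  exp(-M_X/kT) = {factor:.3e}")
```

Only the ratio of the mass to the temperature matters, so the conclusion does not depend on the assumed mass: once the temperature falls well below the mass, inverse decays are exponentially rare.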


To be more specific, consider the following reaction rates: 

$$\gamma_a = \Gamma(X \to l^c q^c); \qquad \gamma_b = \Gamma(X \to q q)$$

$$\bar\gamma_a = \Gamma(X^c \to l q); \qquad \bar\gamma_b = \Gamma(X^c \to q^c q^c) \tag{19.51}$$

where the superscript $c$ refers to the charge conjugated particle. The CPT theorem 
demands that these reaction rates obey the following relation: 

$$\gamma_a + \gamma_b = \bar\gamma_a + \bar\gamma_b \tag{19.52}$$

Furthermore, at the Born level, CPT enforces the following conditions: 

$$\gamma_a = \bar\gamma_a; \qquad \gamma_b = \bar\gamma_b \tag{19.53}$$

However, beyond the Born term, higher-order interactions can destroy this relation. 
The presence of C and CP violating higher-order processes can produce the 
following relations: 

$$\gamma_a - \bar\gamma_a = \bar\gamma_b - \gamma_b \neq 0 \tag{19.54}$$

without violating the CPT theorem. 
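The bookkeeping behind these relations can be made concrete with a small calculation. The sketch below assumes the usual baryon number assignments (1/3 per quark, 0 per lepton), which are not spelled out in this section, and uses made-up partial rates normalized to a total width of one; the shift `eps` is a hypothetical stand-in for the higher-order C and CP violating contribution.

```python
from fractions import Fraction

# Baryon number of each final state (quarks carry B = 1/3, leptons B = 0).
B_QQ = Fraction(2, 3)      # X -> q q
B_LCQC = Fraction(-1, 3)   # X -> l^c q^c

def net_baryon_number(gamma_a, gamma_b, gamma_a_bar, gamma_b_bar):
    """Mean baryon number produced by the decay of one X plus one X^c.

    gamma_a     : rate for X   -> l^c q^c
    gamma_b     : rate for X   -> q q
    gamma_a_bar : rate for X^c -> l q
    gamma_b_bar : rate for X^c -> q^c q^c
    """
    total = gamma_a + gamma_b
    # CPT requires equal total widths for particle and antiparticle.
    assert total == gamma_a_bar + gamma_b_bar, "total widths must agree"
    from_x = (B_LCQC * gamma_a + B_QQ * gamma_b) / total
    from_x_bar = (-B_LCQC * gamma_a_bar - B_QQ * gamma_b_bar) / total
    return from_x + from_x_bar

half = Fraction(1, 2)

# Born level: the partial rates agree pairwise, so nothing survives.
born = net_baryon_number(half, half, half, half)

# Hypothetical higher-order shift: partial rates differ, total widths do not.
eps = Fraction(1, 1000)
shifted = net_baryon_number(half - eps, half + eps, half + eps, half - eps)

print(born, shifted)
```

With equal partial rates the net baryon number vanishes identically, while a shift that leaves the total widths equal produces a net baryon number of twice the shift per pair of decays.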

For example, in the minimal GUT theory, the first baryon asymmetric term 
enters in at the 10th level in perturbation theory. (Unfortunately, this is many 
orders of magnitude too small to explain the observed $10^{-9}$ asymmetry. More 
complicated GUT theories, however, can obtain the observed asymmetry.) 


19.5 Inflation 


There are two puzzles that, within the framework of classical general relativity, 
cannot be solved: the flatness problem and the horizon problem. A plausible 
explanation for both, however, can be given if we add the effects of gauge theories 
to general relativity. 

The flatness problem arises because the universe appears much flatter than 
it should be. We know that there is a critical density $\rho_c$, such that if $\rho < \rho_c$, 
the gravitational pull of the matter in the universe is too weak to reverse the 
expansion, and the universe expands forever. For $\rho > \rho_c$, the gravitational pull 
is strong enough to force the expansion to stop and eventually reverse itself. 
However, the density of the universe today seems to be fairly close to the critical 
value, $\rho \sim \rho_c = 3H^2/8\pi G \sim 5 \times 10^{-30}$ g/cm$^3$. If we define: 

$$\Omega = \frac{\rho}{\rho_c} \tag{19.55}$$

then we find that $\Omega \sim 0.1$ to $10$. 
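The quoted size of the critical density is easy to check numerically. The following sketch evaluates it in cgs units; the Hubble parameter of 50 km/s/Mpc is an assumed, illustrative value (the result scales as the square of the Hubble parameter), and the function names are introduced here only for the example.

```python
import math

G_CGS = 6.674e-8          # Newton's constant in cm^3 g^-1 s^-2
CM_PER_MPC = 3.0857e24    # centimetres in one megaparsec

def critical_density(hubble_km_s_mpc):
    """rho_c = 3 H^2 / (8 pi G) in g/cm^3 for H given in km/s/Mpc."""
    hubble_per_s = hubble_km_s_mpc * 1.0e5 / CM_PER_MPC  # convert to 1/s
    return 3.0 * hubble_per_s**2 / (8.0 * math.pi * G_CGS)

def omega(rho, hubble_km_s_mpc):
    """Density parameter: the ratio of rho to the critical density."""
    return rho / critical_density(hubble_km_s_mpc)

H0 = 50.0  # assumed Hubble parameter in km/s/Mpc (illustrative value)
rho_c = critical_density(H0)
print(f"rho_c = {rho_c:.2e} g/cm^3")
```

For this choice of the Hubble parameter the result is a few times $10^{-30}$ g/cm$^3$, in line with the estimate quoted above.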
Now assume that we extrapolate $\Omega$ backwards in time, so that we compute $\Omega$ 
near the beginning of the universe. $\Omega$ rapidly becomes close to one as we go back 
in time, meaning that $\Omega$ was fine tuned in the early universe. For example, if we 
extrapolate back to the GUT universe, we find: 


Q=1+0(10-*) at T = 10° GeV (19.56) 


This means that, near the beginning of time, $\Omega$ was fine tuned to be 1 to one part 
in 10°°, which is difficult to believe. 

The horizon problem has a similar origin. In general relativity, the horizon 
refers to the farthest distance that we can see. If we look in distant parts of the 
heavens, we find that the universe is quite isotropic. In fact, the universe seems 
to be much more isotropic than it should be. In particular, the 2.7 K background 
radiation appears to be very uniform, no matter where we look in outer space. 
But this is difficult to understand. For distant parts of the universe to be isotropic, 
they had to be in causal contact with each other in the distant past. Because of 
the limitation imposed by the speed of light, one can show that distant parts of 
the visible universe could not be in causal contact with each other. Hence, the 
universe should not be so isotropic. 

Although the classical theory of general relativity has difficulty explaining the 
flatness and horizon problem, one byproduct of gauge theory, Guth’s inflationary 
universe,?—"' has a plausible explanation for both. 

Whenever we have spontaneous symmetry breaking in the Higgs sector coupled 
to gravity, we generate a constant term, which corresponds to increasing 
the energy density of the vacuum. Normally, we throw this away. However, in 
general relativity, this constant is multiplied by $\sqrt{-g}$, so that it contributes to the 
cosmological constant. 

If we have a large cosmological constant $\Lambda$ in the Einstein equations, then we 
must use what is called de Sitter’s solution. Like the standard Big Bang solution, 
the de Sitter solution is found by assuming spherical symmetry, so the metric is 
a function of the radius and time. Einstein’s equations then reduce to a simple 
equation whose solution, in the presence of the cosmological constant, is an 
exponential expansion rather than the standard power law expansion: 


R(t) ~ e® (19.57) 

where: 
oa ( arp" 2 19.58 
~\3m2 Mp Wega 


where $T_c$ is the critical temperature at which inflation begins. This exponential 
expansion, which naturally emerges whenever a symmetry is broken spontaneously, 
might be large enough to solve the flatness and horizon problems if it were on the 
order of 10°°. 
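The difference between exponential and power law expansion can be illustrated numerically. The sketch below assumes the spatially flat Friedmann equation, in which the square of the expansion rate of the scale factor is proportional to the energy density times the square of the scale factor; this equation is not written out in this section, and the units and the helper name `expand` are chosen purely for illustration. A constant vacuum energy gives exponential growth, while radiation, whose density falls as the inverse fourth power of the scale factor, gives growth like the square root of time.

```python
import math

def expand(rate_of_r, r0=1.0, t_end=5.0, steps=5000):
    """Integrate dR/dt = rate_of_r(R) with a fixed-step RK4 scheme."""
    r, dt = r0, t_end / steps
    for _ in range(steps):
        k1 = rate_of_r(r)
        k2 = rate_of_r(r + 0.5 * dt * k1)
        k3 = rate_of_r(r + 0.5 * dt * k2)
        k4 = rate_of_r(r + dt * k3)
        r += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return r

H0 = 1.0  # expansion rate at t = 0 in arbitrary units (assumption)

# Constant vacuum energy: rho is fixed, so dR/dt = H0 * R.
r_vacuum = expand(lambda r: H0 * r)
# Radiation: rho ~ R^-4, so dR/dt = H0 / R and R^2 = 1 + 2 H0 t.
r_radiation = expand(lambda r: H0 / r)

print(r_vacuum, math.exp(5.0))        # numerical vs closed-form exponential
print(r_radiation, math.sqrt(11.0))   # numerical vs closed-form power law
```

After the same elapsed time the vacuum-dominated scale factor has grown by a factor of $e^5$, while the radiation-dominated one has only grown by $\sqrt{11}$, and the gap widens exponentially with time.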

The flatness problem may be solved because the visible universe that we can 
observe is only a tiny fraction of the total universe. Thus, our universe appears to 
be flat only because the radius of the universe is so large. 

The horizon problem may be solved because our present-day universe, ex- 
trapolated back in time, was only a tiny speck in the original primordial nucleus 
within which points were in thermal equilibrium. Thus, it is not surprising that 
distant points in today’s universe can have the same uniform temperature. Near 
the beginning of time, our universe was small enough so that all points could be 
in causal contact with other points. 

As attractive as the inflation theory is, only detailed experimental observations, 
for example, of the radiation left over from the early universe, will ultimately de- 
termine whether the inflation theory holds up with time. (There are, of course, 
problems with inflation. There is no unique way to introduce the potential nec- 
essary to yield an expansion of 10°°. There are several alternatives, but we often 
wind up reintroducing some form of fine-tuning back into the problem, which is 
undesirable.) 


19.6 Cosmological Constant Problem 


Although a naive application of GUT theory to cosmology seems to generate 
experimentally reasonable results, we should mention a serious problem with 
this (as well as any other) approach. This is the celebrated cosmological constant 
problem. Experimentally, we can measure the possible presence of the cosmologi- 
cal constant A by measuring exponential deviations from the standard R(t) ~ 177 
expansion. Experimentally, we find that it is consistent with zero to a remarkable 
degree: 


A < 107!°2 = 10-* GeV" (19.59) 


However, every time we break a symmetry spontaneously, we generate a vacuum 
energy proportional to: 


$$\Lambda_{GUT} \sim \left. \frac{8\pi}{3} G_N V(\phi_i) \right|_{\phi_i = \langle \phi_i \rangle} \tag{19.60}$$


Putting in the value of the SU(5) potential minimum with order M%., we find that 
Agut is 10! times too big. 

In all of physics, nowhere do we find a greater divergence between theory 
and experiment than in the cosmological constant problem. The addition of new 
symmetries (such as supersymmetry, which we discuss in the next chapter) can 
reduce this discrepancy, but only down to about 10°°. 

The problem, at present, seems intractable. Even if we could somehow put 
$\Lambda = 0$ at early times (which is one byproduct of supersymmetry), we still have 
new contributions to the cosmological constant when we break supersymmetry 
and approach present-day energies. These, too, must be set to zero by a mechanism 
that is yet unknown. 


19.7 Kaluza—Klein Theory 


Perhaps the most theoretically clumsy feature of GUT theory is that general 
relativity is spliced onto the theory by brute force. Ideally, we would like to 
see gravitational interactions and GUT theory emerge from a higher unified field 
theory from geometrical or group theoretical arguments, rather than being put in by 
hand. The search for a more sophisticated theory embracing both gauge theory and 
general relativity has led to a re-examination of the old theory of Kaluza—Klein, 
which is perhaps one of the most ingenious extensions of the theory of gravity. 
Kaluza'? originally proposed uniting both Maxwell’s theory of electromagnetism 
and Einstein’s theory of general relativity by embedding both theories into a 
generally covariant five-dimensional space-time. When first proposed in 1919, 
the theory lacked an answer to the question: what happened to the fifth dimension? 
Seventy years later, we are still grappling with this question. 

Kaluza assumed that the fifth dimension was curled up into a tiny ring so 
small that it could not be experimentally observed by any instrument. Thus, 
although space-time may actually be five dimensional, experiments designed to 
determine the size of the fifth dimension would be too crude to detect this. Klein" 
then assumed that quantum corrections caused the fifth dimension to curl up. In 
quantum gravity, there is only one dimensionful parameter, which is the Planck 
length, or $10^{-33}$ cm. Since this sets the scale for quantum gravity, it means that 
the fifth dimension might have curled up with approximately this radius, which is 
too small for any instrument to detect. 

Since the fifth spatial dimension is periodic, if we move in this direction, 
we eventually return to the starting point. The fifth dimension has the topology 
of a circle: 

$$\phi(x_5) = \phi(x_5 + 2\pi r) \tag{19.61}$$

where $r$ is the radius of the fifth dimension. If we expand the field $\phi(x)$ in this 
periodic space as: 

$$\phi(x) = \sum_n \phi_n e^{ipx_5} \tag{19.62}$$

we find that $p = n/r$, so the momentum conjugate to $x_5$ is quantized in terms of the 
integer $n$. These higher modes $\phi_n$ correspond to particles of mass $10^{19}$ GeV. To 
analyze the low energy limit of the theory, we can safely ignore these higher mass 
particles and take only the $n = 0$ mode of the power expansion. This means that 
$\phi(x)$, in this approximation, loses all dependence on the fifth coordinate: 

$$\partial_5 \phi(x) \sim 0 \tag{19.63}$$

This, in turn, allows us to decompose five-dimensional general relativity into its 
four-dimensional fields. 
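The mass scale of the higher modes follows from converting the quantized momentum into energy units. The sketch below is a rough estimate only: it takes the radius to be of the order of the Planck length quoted above and ignores any factors of order one; the function name `kk_mass_gev` is introduced for the example.

```python
HBAR_C_GEV_CM = 1.9733e-14  # hbar * c in GeV cm

def kk_mass_gev(n, radius_cm):
    """Mass of the n-th Kaluza-Klein mode, m_n = |n| / r, converted to GeV."""
    return abs(n) * HBAR_C_GEV_CM / radius_cm

r = 1.0e-33  # assumed radius of the fifth dimension in cm (Planck scale)
tower = [kk_mass_gev(n, r) for n in range(4)]
print(tower)
```

The zero mode is massless, while the first excited mode already lies at roughly $10^{19}$ GeV, which is why only the zero mode is kept at low energies.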

Let $A, B, C, \ldots$ represent five-dimensional space-time indices. Let us define 
a new field, called $A_\mu = g_{5\mu}$. The metric tensor now decomposes as follows: 


2g ca | As KA, 
BAB = ( i i (19.64) 
KA @ 


Einstein’s action in five-dimensional space, with the four-dimensional fields 
separated out, now reads: 


1 vo 
/ det gap oP Rag= /—detg a, (or"R,. — q Pur Fpo8"8 ) +--- (19.65) 


We have decomposed a five-dimensional theory into a four-dimensional theory, 
yielding the usual Maxwell theory coupled to general relativity. 

We can also see how Maxwell’s equations emerge by analyzing the gauge 
symmetry. The metric tensor component $g_{\mu 5} = A_\mu$ transforms as follows: 

$$\delta g_{\mu 5} = \delta A_\mu = \partial_5 \epsilon_\mu + \partial_\mu \epsilon_5 \sim \partial_\mu \epsilon_5 \tag{19.66}$$

By taking the fifth coordinate sufficiently small, we retrieve the gauge variation 
of the Maxwell field: $\delta A_\mu = \partial_\mu \Lambda$. 

Since the Maxwell field emerges as a byproduct of dimensional reduction, 
one should be able to derive a relationship between the electric charge, Newton’s 
constant, and the radius of the fifth coordinate. Consider, for example, the Dirac 
Lagrangian in a gravitational and electromagnetic field: 

$$\mathcal{L} = \sqrt{-g}\, \bar\psi\, i\gamma^\mu (\partial_\mu + ieA_\mu) \psi \tag{19.67}$$

The coupling of the fermion to the vector potential is given by: 

$$ieA_\mu \bar\psi \gamma^\mu \psi \tag{19.68}$$

Now perform dimensional reduction on the fifth coordinate, using the fact 
that $\partial_5 \sim 1/r$, where $r$ is the radius of the fifth coordinate. After dimensional 
reduction, we have: 

$$\sim \frac{\kappa}{r} A_\mu \bar\psi \gamma^\mu \psi \tag{19.69}$$

Equating the coefficients, we then have: 

$$e \sim \kappa/r \tag{19.70}$$

For the fine structure constant $e^2/4\pi$ to be $\sim 1/137$, this means that $r$ is a bit 
larger than the Planck length. 
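The size of the radius implied by this relation can be estimated in a few lines. This is an order-of-magnitude sketch only: it assumes that the gravitational coupling can be replaced by the Planck length with all numerical factors of order one dropped, and it converts the fine structure constant to a charge through the usual relation between the two.

```python
import math

ALPHA = 1.0 / 137.036          # fine structure constant
PLANCK_LENGTH_CM = 1.616e-33   # Planck length in cm

def radius_from_charge(kappa_cm, alpha=ALPHA):
    """Solve e ~ kappa / r for r, with e = sqrt(4 pi alpha)."""
    e = math.sqrt(4.0 * math.pi * alpha)
    return kappa_cm / e

# Assumption: kappa is set equal to the Planck length, dropping O(1) factors.
r = radius_from_charge(PLANCK_LENGTH_CM)
print(r / PLANCK_LENGTH_CM)
```

Under these assumptions the radius comes out a few times the Planck length, consistent with the statement that it is only a bit larger than that scale.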


19.8 Generalization to Yang-Mills Theory 


The Kaluza—Klein method has a straightforward generalization to Yang—Mills 
theory. In fact, its first published announcement came as a homework problem in 
1963 at the Les Houches Summer School.'*:'> 

We now work in $(4 + N)$-dimensional space, which is decomposed as the 
product of flat Minkowski space $M_4$ and another $N$-dimensional manifold $G$. We 
will thus work with the space $M_4 \otimes G$. We use $A, B, C$ indices to represent this 
larger space; $\mu, \nu$ to represent the four-dimensional space; $m, n$ to represent the 
$N$-dimensional space; and $a, b$ to represent the adjoint representation of a gauge 
group. 

To distinguish the metric tensors in various spaces, we will define $\gamma_{AB}$ to be 
the metric tensor in the larger $(4 + N)$-dimensional space. Let $\mu, \nu$ be the indices 
describing four-dimensional space and let $g_{\mu\nu}$ be the metric tensor for this 
four-dimensional space. Correspondingly, let $m, n$ be the indices describing the 
$N$-dimensional space, with metric $\gamma_{mn}$. Let $x$ parametrize four-dimensional space, 
and let $y$ parametrize $N$-dimensional space. 

To isolate the Yang-Mills field, let us now reparametrize the metric tensor. 
There are many ways to do this, but a convenient choice involves introducing a 
new field $B_\mu^m$, which is a mixed tensor. We will choose the following: 

$$\gamma_{AB} = \begin{pmatrix} g_{\mu\nu} + \gamma_{mn} B_\mu^m B_\nu^n & \gamma_{mn} B_\mu^m \\ \gamma_{mn} B_\nu^n & \gamma_{mn} \end{pmatrix} \tag{19.71}$$

where $g_{\mu\nu}$ is only a function of $x$, and $\gamma_{mn}$ is only a function of $y$. 




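By a direct calculation, we can show that the inverse metric is given by: 

$$\gamma^{AB} = \begin{pmatrix} g^{\mu\nu} & -g^{\mu\nu} B_\nu^n \\ -B_\mu^m g^{\mu\nu} & \gamma^{mn} + B_\mu^m B_\nu^n g^{\mu\nu} \end{pmatrix} \tag{19.72}$$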
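The block structure of this inverse can be checked numerically. The following sketch builds the metric from a four-dimensional block, an internal block, and a mixed field, then multiplies it by the claimed inverse. The numerical values, the choice of two internal dimensions, and the diagonal form of the two metrics (chosen so that their inverses are trivial) are all assumptions made for illustration; the check itself does not depend on them.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def diag(values):
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def neg(a):
    return [[-x for x in row] for row in a]

def block(tl, tr, bl, br):
    """Assemble a matrix from its four blocks."""
    top = [r1 + r2 for r1, r2 in zip(tl, tr)]
    bottom = [r1 + r2 for r1, r2 in zip(bl, br)]
    return top + bottom

# Illustrative values: diagonal metrics so that their inverses are trivial.
g_diag = [1.0, -1.0, -1.0, -1.0]   # four-dimensional metric
gam_diag = [2.0, 0.5]              # internal metric, here with N = 2
g, g_inv = diag(g_diag), diag([1.0 / v for v in g_diag])
gam, gam_inv = diag(gam_diag), diag([1.0 / v for v in gam_diag])
B = [[0.3, -1.2, 0.7, 0.1],        # B[m][mu], arbitrary numbers
     [1.5, 0.4, -0.6, 2.0]]
Bt = transpose(B)

metric = block(add(g, matmul(Bt, matmul(gam, B))), matmul(Bt, gam),
               matmul(gam, B), gam)
inverse = block(g_inv, neg(matmul(g_inv, Bt)),
                neg(matmul(B, g_inv)),
                add(gam_inv, matmul(B, matmul(g_inv, Bt))))

product = matmul(metric, inverse)
print(all(abs(product[i][j] - (1.0 if i == j else 0.0)) < 1e-12
          for i in range(6) for j in range(6)))
```

The product is the identity for any choice of the mixed field, which reflects the fact that the cancellation happens block by block.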


Our task is to now insert the value of the metric, parametrized in this way, into 
the Riemann curvature tensor defined over the larger $(4 + N)$-dimensional space. 
The calculation is straightforward, yielding: 

$$\sqrt{\det \gamma_{AB}}\, R_{AB}\gamma^{AB} = \sqrt{\det g_{\mu\nu} \det \gamma_{mn}} \left[ R_4(x) + R_N(y) + \frac{1}{4} \gamma_{mn}(y) F_{\mu\nu}^m F_{\rho\sigma}^n g^{\mu\rho}(x) g^{\nu\sigma}(x) + \cdots \right] \tag{19.73}$$


where $R_4(x)$ and $R_N(y)$ are the respective four- and $N$-dimensional curvature 
scalars, but $F_{\mu\nu}^m$ is not the usual Yang-Mills field tensor. Instead, it equals: 

$$F_{\mu\nu}^m = \partial_\mu B_\nu^m - B_\mu^n \partial_n B_\nu^m - (\mu \leftrightarrow \nu) \tag{19.74}$$

Clearly, this is not the Yang-Mills tensor, and therefore $B_\mu^m$ cannot be the 
Yang-Mills field. Notice also that the structure constant of the gauge group $f_{ab}^c$ 
appears nowhere in our discussion, so we are missing some essential element. 

At this point, we must make one more assumption that is not so obvious at 
first. We will make the assumption that the manifold has a symmetry associated 
with it; that is, we say that the manifold has an “isometry.” On manifolds with 
isometry, we can also extract a “Killing vector” $\xi_\mu$ that mathematically expresses 
the effect of this isometry. 

For example, if a manifold loses all its dependence on the $k$th coordinate, then 
it has a symmetry that is mathematically expressed as $\partial_k g_{\mu\nu} = 0$. The generator 
of this symmetry is labeled by $L_k = \delta_k^\mu \partial_\mu = \partial_k$. The Killing vector is then defined 
as $\xi_k^\mu = \delta_k^\mu$. Covariantly, it satisfies the equation: 

$$\nabla_\mu \xi_\nu + \nabla_\nu \xi_\mu = 0 \tag{19.75}$$

which is sometimes taken to be the definition of a Killing vector. 

One example of a manifold with an isometry and a Killing vector is a 
two-dimensional torus. Its isometry is the set of rotations in the azimuthal angle $\theta$ 
about its vertical axis. Its Killing vector is $\partial_\theta$. Another example is the 
two-dimensional sphere $S_2$. Its isometries consist of rotations in three dimensions 
about its center. The set of motions generated by these rotations is, of course, the 
Lie group $SO(3)$. 
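The case of the sphere can be checked explicitly. The sketch below works in the flat three-dimensional space in which the unit sphere is embedded and uses the velocity fields of rotations about the three coordinate axes; these are Killing vectors of the flat space that happen to be tangent to the sphere, which is an assumption of this illustration rather than a construction taken from the text. It verifies tangency, the flat-space form of the Killing equation, and the closure of the fields under the Lie bracket.

```python
def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

AXES = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))

def killing(a):
    """Velocity field of a rotation about axis a: the cross product e_a x x."""
    return lambda x: cross(AXES[a], x)

def jacobian(field, x, h=1e-3):
    """J[i][j] = d field^i / d x^j by central differences."""
    cols = []
    for j in range(3):
        up = list(x); up[j] += h
        dn = list(x); dn[j] -= h
        fu, fd = field(up), field(dn)
        cols.append([(fu[i] - fd[i]) / (2.0 * h) for i in range(3)])
    return [[cols[j][i] for j in range(3)] for i in range(3)]

def lie_bracket(f, g, x):
    """[f, g]^i = f^j d_j g^i - g^j d_j f^i evaluated at x."""
    jf, jg, fx, gx = jacobian(f, x), jacobian(g, x), f(x), g(x)
    return tuple(sum(fx[j] * jg[i][j] - gx[j] * jf[i][j] for j in range(3))
                 for i in range(3))

point = (0.6, 0.0, 0.8)  # a point on the unit sphere
xi = [killing(a) for a in range(3)]

# Tangency: each field is orthogonal to the radial direction.
tangent = [sum(p * c for p, c in zip(point, xi[a](point))) for a in range(3)]
# Flat-space Killing equation: the symmetrized gradient vanishes.
jac = jacobian(xi[0], point)
killing_eq = [[jac[i][j] + jac[j][i] for j in range(3)] for i in range(3)]
# Closure: the bracket of two rotation fields is again a rotation field.
bracket = lie_bracket(xi[0], xi[1], point)
print(tangent, bracket, xi[2](point))
```

With this sign convention the bracket of the first two fields equals minus the third, so the three fields close into the algebra of three-dimensional rotations.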

If we have an arbitrary manifold $G$ with a set of isometries associated with it, 
then these isometries will in general generate a Lie algebra associated with these 
symmetries. Let us say that the generators of this symmetry are described by: 

$$L_a = \xi_a^m \partial_m \tag{19.76}$$

such that they, by definition, generate a Lie algebra: 

$$[L_a, L_b] = f_{ab}^c L_c \tag{19.77}$$

where $f_{ab}^c$ is the structure constant of a Lie algebra. Inserting the value of $L_a$ into 
this equation, we find: 

$$\xi_a^m \partial_m \xi_b^n - \xi_b^m \partial_m \xi_a^n = f_{ab}^c \xi_c^n \tag{19.78}$$

With this Killing vector, we can now define: 

$$B_\mu^m = \xi_a^m A_\mu^a \tag{19.79}$$

This is the redefinition we were seeking, where $A_\mu^a$ is the true Yang-Mills vector. 
Inserting this back into the $F_{\mu\nu}^m$ tensor, we get: 

$$F_{\mu\nu}^m = \xi_a^m F_{\mu\nu}^a \tag{19.80}$$

where $F_{\mu\nu}^a$ is the true Yang-Mills tensor, and the higher dimensional action 
contains the Yang-Mills action. It is now straightforward to show that the original 
action in $(4 + N)$-dimensional space splits up into two parts, the usual 
four-dimensional theory of Einstein and the standard Yang-Mills theory. 

Now let us return to the expression we previously derived for the dimensional 
reduction of $R_{AB}$. The key idea is that now we can perform the integration over 
$y$, yielding: 

$$\int d^N y\, \sqrt{\det \gamma_{mn}(y)}\, \gamma_{mn}(y)\, \xi_a^m(y)\, \xi_b^n(y) \sim \Omega_N \delta_{ab} \tag{19.81}$$

where $\Omega_N$ is the volume of the $y$ space. This is the last step in the construction 
of the Yang-Mills action from the Einstein-Hilbert action in $4 + N$ dimensions. 
The lesson learned from this exercise is that we cannot simply take a 
$(4 + N)$-dimensional manifold $M_4 \otimes G$ and expect the Yang-Mills theory to emerge. The 
extra assumption that we need is that the manifold $G$ has a set of symmetries 
associated with it which generate a Lie algebra. Then the Yang-Mills field 
emerges as a function of the Killing fields of the isometries. 

All this, of course, is formal, but let us see whether any phenomenology 
is possible with Kaluza-Klein Yang-Mills theories. Several questions come 
to mind immediately: 


1. Can the Standard Model gauge group be included in this scenario? 
2. Can complex representations of fermions be included? 

3. Is the theory renormalizable? 

4. Why should higher-dimensional space compactify? 


5. What about the cosmological constant? 


To answer the first, we will use the fact that the Yang—Mills theory is generated 
by isometries of the space-time manifold. Our task is to find the manifold that 
has the group of the Standard Model as its isometry group.'® 

The isometry of the circle $S_1$ is easy to find; it is represented by a simple 
rotation about its axis, which can be obtained via $SO(2)$ or $U(1)$. The isometries 
of the ordinary sphere $S_2$ can be obtained via rotations, labeled by $SO(3)$ or $SU(2)$. 
In general, the isometry group of $S_n$ is given by $SO(n + 1)$. This is easy to see, 
because the defining equation of $S_n$ is given by: 

$$\sum_{i=1}^{n+1} x_i^2 = 1 \tag{19.82}$$

which is invariant under $SO(n + 1)$ rotations on $x_i$. 

Likewise, the isometry group $SU(3)$ can be obtained via $CP_2$. [$CP_n$ is the 
complex space spanned by $n + 1$ complex coordinates $z_i$, such that the point 
$\{z_1, z_2, \ldots, z_{n+1}\}$ is identified with the point $\{\lambda z_1, \lambda z_2, \ldots, \lambda z_{n+1}\}$ for nonzero 
complex $\lambda$. Notice that this definition is invariant under $SU(n + 1)$ rotations.] 

We are therefore interested in the isometries of the $4 + 2 + 1 = 7$ dimensional 
manifold: 

$$CP_2 \otimes S_2 \otimes S_1 \tag{19.83}$$

Thus, $4 + 7 = 11$ is the minimal number of total dimensions that we must have in 
order to have a Standard Model gauge group. 
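The dimension counting can be made explicit in a few lines. The sketch below uses the standard real dimensions of spheres and complex projective spaces and the standard numbers of generators of the rotation and special unitary groups; the function names are introduced only for this illustration.

```python
def dim_sphere(n):
    return n                   # real dimension of S_n

def dim_cp(n):
    return 2 * n               # real dimension of CP_n

def dim_so(n):
    return n * (n - 1) // 2    # number of generators of SO(n)

def dim_su(n):
    return n * n - 1           # number of generators of SU(n)

internal = dim_cp(2) + dim_sphere(2) + dim_sphere(1)
total = 4 + internal
gauge_bosons = dim_su(3) + dim_so(3) + dim_so(2)
print(internal, total, gauge_bosons)
```

The seven internal dimensions carry an isometry group with twelve generators in total, matching the number of gauge fields of the Standard Model group.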

Now that we have successfully shown that a class of seven-dimensional 
manifolds exists that can reproduce the isometry group of the Standard Model, our 
next step is to ask whether this formalism can reproduce the complex fermions of 
the Standard Model. 

Here, however, our formalism fails. There are powerful mathematical 
theorems that, in fact, forbid complex representations of fermions in this approach. 
This is disappointing, because it means that the Kaluza-Klein approach is not rich 
enough to support the fermionic representations of the Standard Model. 

To see this, we first study the Dirac operator defined on a $(4 + N)$-dimensional 
product manifold, which splits up into two pieces: 

$$i\Gamma^A D_A = i\gamma^\mu D_\mu(x) + i\gamma^m D_m(y) \tag{19.84}$$

where each covariant derivative depends crucially on the structure of the manifold 
(via the vierbein and the connection). In general, we are looking at the eigenvalues 
of the Dirac operator. If the Dirac operator on the $N$-dimensional manifold has 
eigenvalue $m$, then we have: 

$$i\Gamma^A D_A = i\gamma^\mu D_\mu(x) + m \tag{19.85}$$

However, the mass $m$ is of the order of the Planck mass, which is much larger 
than the experimentally observed lepton and quark masses. Therefore, we must 
set $m = 0$, meaning that we must look at the zero eigenvalue of the Dirac operator. 

The Atiyah-Hirzebruch index theorem, however, states that 
manifolds that have zero eigenvalues of the Dirac operator can only have real 
representations of fermions. 

This theorem leaves us with only a few options. Either we adopt complicated 
modifications of Riemannian manifolds in order to avoid this theorem, or we drop 
Riemannian manifolds entirely, and study supersymmetric and superstring-type 
theories. 

But perhaps the most serious problem with quantum gravity and quantum 
Kaluza-Klein theory is that they are all nonrenormalizable. We now turn to this 
problem, which has baffled physicists for over half a century. Over the years, 
a number of alternative approaches have been proposed to renormalize gravity, 
none of them very successful. For example, let us assume that general relativity 
is an “effective theory,” and assume that we introduce counterterms to cancel 
divergences at each order. Since we wish to preserve general covariance, the 
counterterms will be of the generic form $R^2, R^3, R^4, \ldots$, where $R$ is composed 
of the Riemann curvature tensor. Because these higher terms contain higher 
derivatives, a theory of this type can be shown to converge sufficiently rapidly 
to be renormalizable. However, the modified theory is no longer unitary. $R^2$ 
has four derivatives in it, which leads to a theory with unitarity ghosts. (This is 
not surprising, since the higher $R$ terms act like a Pauli-Villars cutoff, which we 
know introduces ghost states.) In other words, we gain renormalizability but lose 
unitarity. 


19.9 Quantizing Gravity 


To see why general relativity is not renormalizable, it is first important to explain 
how to quantize the theory. We begin the process of quantization by power 
expanding the metric tensor around some classical solution $g^{(0)}_{\mu\nu}$ of the equations 
of motion: 

$$g_{\mu\nu} = g^{(0)}_{\mu\nu} + \kappa h_{\mu\nu} \tag{19.86}$$

where $h_{\mu\nu}$ is the graviton field and $\kappa \sim \sqrt{G_N}$. The classical metric $g^{(0)}_{\mu\nu}$ is usually 
taken to be the Lorentz metric. Given this expansion, we can also expand the 
Christoffel symbols, and hence the entire action, in a power series in $h_{\mu\nu}$. Each 
term of the power series contains two derivatives and an increasing number of $h_{\mu\nu}$ 
fields and powers of the coupling constant. The action is nonpolynomial. The 
existence of a dimensionful Newton’s constant, then, is the origin of the problem 
of the nonrenormalizability of gravity. 
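The way the divergences grow with loop order can be seen from simple power counting. The sketch below is a standard textbook estimate rather than something derived in this section: it assumes that every graviton vertex carries two powers of momentum, that every propagator falls as the inverse square of the momentum, and that the number of internal lines follows from the usual topological relation between loops, lines, and vertices.

```python
def superficial_divergence(loops, vertices, dim=4):
    """Degree of divergence D = dim*L + 2*V - 2*I for pure gravity.

    Each vertex contributes two powers of momentum, each propagator
    contributes 1/k^2, and the number of internal lines I is fixed by
    the topological relation L = I - V + 1.
    """
    internal_lines = loops + vertices - 1
    return dim * loops + 2 * vertices - 2 * internal_lines

for loops in (1, 2, 3):
    # The result does not depend on how many vertices the graph has.
    degrees = {superficial_divergence(loops, v) for v in range(1, 8)}
    print(loops, degrees)
```

In four dimensions the degree of divergence grows by two with every loop, so ever higher powers of the curvature are needed as counterterms: four derivatives at one loop (curvature squared), six at two loops (curvature cubed), and so on.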

Although the theory is not renormalizable, one can still study its Feynman 
rules and scattering matrices to lowest order. The Feynman rules for the graviton 
propagator can be obtained by extracting the lowest order term quadratic in the 
graviton field $h_{\mu\nu}$. 

The Lagrangian, in this approximation, reduces to: 


Zo = = [—(hpo)? + (GAB)? — 20,h8 4, h7% + 2phya dh”? | (19.87) 


Ale 


(where raising and lowering of indices is now performed by the flat Minkowski 
metric). If we make a gauge choice, we can simplify this a bit. We can, of course, 
add a term: 

Boas 1 


=C,;; Cr=on,— 5 8uhy (19.88) 


to the action to break the gauge. 

The sum of both the original Lagrangian and the gauge part simplifies the total 
action to: 


1 
1 1 
Voouv = 7 bender — 7S po Suv (19.89) 


We can now invert the matrix $V_{\rho\sigma\mu\nu}$ to obtain the final propagator. (One can 
check that the propagator is singular if the gauge-breaking part is missing.) We 
find the result for the propagator: 


k? +i€ 
The calculations for the higher vertices, however, are prohibitively difficult. 
We must use special gauges and special tricks in order to reduce the number of 
possible interaction terms. 


19.10 Counterterms in Quantum Gravity 


Although quantum gravity is formally nonrenormalizable, we can still hope that 
(by a series of miracles) the divergences of the quantum loops cancel, leaving us 
with a finite theory. Usually, miracles occur because of a local symmetry, leading 
to Ward identities that cancel certain unwanted graphs. For quantum gravity, 
however, we have no more symmetries by which to cancel the higher-loop graphs. 

For cancellations to happen, higher-loop counterterms must be forbidden by 
some unknown mechanism. If we can show that these higher-loop counterterms 
cannot exist, then the theory might have a chance at being finite. Let us first 
enumerate the one-loop counterterms that are invariant. There are just three, 
given by the set: $R_{\mu\nu\rho\sigma}^2$, $R_{\mu\nu}^2$, and $R^2$. 

In the background field method (see Exercise 14.17), the counterterms are 
gauge invariant, and we are allowed to eliminate some of them via the equations 
of motion. If we set $T_{\mu\nu} = 0$, then $R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R = 0$, which implies $R_{\mu\nu} = 0$. 
Thus, we are left with only one possible counterterm: $R_{\mu\nu\rho\sigma}^2$. The question is: 
Can some unforeseen identity or symmetry prevent this invariant from appearing 
as a counterterm? If so, then general relativity would be one-loop finite even 
without computing a single Feynman diagram. 

It turns out that the answer is, indeed, yes. There is an identity, the Gauss— 
Bonnet identity, that allows us to eliminate this last invariant as a possible coun- 
terterm. 

To see this, we first note that, as in Yang-Mills theory, there is a topological 
invariant corresponding to the square of curvature tensors: 

$$\text{Total derivative} = \epsilon^{abcd} \epsilon^{\mu\nu\rho\sigma} R_{\mu\nu}^{ab} R_{\rho\sigma}^{cd} \tag{19.91}$$


We know how to reduce out the product of two antisymmetric constant tensors. 
We find: 

$$\epsilon^{abcd} \epsilon^{\mu\nu\rho\sigma} = e \sum_P (-1)^P e^{a\mu} e^{b\nu} e^{c\rho} e^{d\sigma} \tag{19.92}$$

where $e$ is the determinant of the vierbein and we sum over the permutations $P$ of 
the indices, which preserves the antisymmetries of the antisymmetric tensor. (The 
left-hand side of this expression is a pure constant, while the right-hand side is a 
function of $x$. However, one can show that the $x$ dependence of the right-hand 
side drops out.) 
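The permutation structure of this reduction can be verified by brute force in the simplest setting. The sketch below assumes flat Euclidean space, where the vierbein is the identity matrix and its determinant is one, so the product of two antisymmetric symbols should equal a signed sum over permutations of products of Kronecker deltas. A Minkowski signature would change overall signs, which this illustration does not attempt to track.

```python
from itertools import permutations, product

def parity(perm):
    """Sign of a permutation given as a tuple of 0..n-1."""
    sign, seen = 1, [False] * len(perm)
    for start in range(len(perm)):
        length, j = 0, start
        while not seen[j]:
            seen[j] = True
            j = perm[j]
            length += 1
        if length and length % 2 == 0:
            sign = -sign  # every even-length cycle flips the sign
    return sign

PERMS = {p: parity(p) for p in permutations(range(4))}

def eps(*idx):
    """Totally antisymmetric symbol with eps(0, 1, 2, 3) = +1."""
    return PERMS.get(tuple(idx), 0)

def delta_sum(upper, lower):
    """Sum over permutations P of sign(P) * prod_k delta(upper[k], lower[P(k)])."""
    return sum(sign for p, sign in PERMS.items()
               if all(upper[k] == lower[p[k]] for k in range(4)))

# Compare both sides for every one of the 4^8 index assignments.
mismatches = sum(1 for up in product(range(4), repeat=4)
                 for lo in product(range(4), repeat=4)
                 if eps(*up) * eps(*lo) != delta_sum(up, lo))
print(mismatches)
```

No index assignment violates the identity, which is the flat-space content of the statement that the product of two antisymmetric tensors reduces to a signed sum over permutations.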

Plugging this expression into the original one, we find: 

$$\text{Total derivative} \sim 4e \left( R_{\mu\nu\rho\sigma} R^{\mu\nu\rho\sigma} - 4 R_{\mu\nu} R^{\mu\nu} + R^2 \right) \tag{19.93}$$

This means that any counterterm that may appear at the one-loop level can be 
eliminated. The second and third tensors are eliminated by the equations of 
motion, and the first tensor is eliminated by the Gauss-Bonnet identity. 

This is a truly remarkable result, indicating that quantum gravity is less diver- 
gent than previously expected. However, this fortuitous cancellation is actually 
an accident that does not generalize to higher loops. For example, at the two-loop 
level, it has been shown by computer that the following term cannot be cancelled 
by the equations of motion or any known identity!”'8: 


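$$\sim \frac{209}{2880} \frac{1}{(4\pi)^4} \frac{1}{\epsilon}\, C_{\mu\nu\alpha\beta}\, C^{\alpha\beta\rho\sigma}\, C_{\rho\sigma}{}^{\mu\nu} \tag{19.94}$$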


(where $1/\epsilon$ represents the usual divergence found in quantum field theory, and 
where the $C_{\mu\nu\alpha\beta}$ tensor is the Weyl curvature tensor, which is composed of 
Riemann curvature tensors). The fact that this term does not cancel indicates that 
perturbative quantum gravity, by itself, is not a finite theory. This is a great 
disappointment, which has retarded progress in quantum gravity. The final answer will 
require essentially new ideas to remedy this defect. 

Several approaches may be taken to this problem. First, one can still hope that 
the inclusion of matter fields will render the theory less divergent. Unfortunately, 
it can be shown that if we couple spin 0, 1/2, and 1 fields, then the theory 
becomes even more divergent. Even the first loop cancellation via the Gauss— 
Bonnet identity is spoiled, and quantum gravity becomes a divergent theory when 
coupled to matter. 

Second, one might hope that coupling gravity to a spin 3/2 field may render 
the theory less divergent. In the next chapter, we will see that a miracle does, 
in fact, occur for this theory, called supergravity, at the first- and second-loop 
level. As one might expect, the cancellations occur because of a new symmetry 
in the theory, supersymmetry. 
The Ward-Takahashi identities are sufficient to cancel a large class of divergences. 
Unfortunately, these identities are not powerful enough; supergravity appears to 
diverge at the third-loop level. 

Third, one may observe that the nonzero coefficient appearing in the divergent 
two-loop term is 209. This factorizes into (26 — D) x 19/2 for D = 4 dimensions. 
However, in 26 dimensions, this term might vanish exactly. The study of theories 
defined in D = 26 dimensions takes us into superstring theory, which we will 
study in Chapter 21. 
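The arithmetic behind this observation is quickly checked. The sketch below only evaluates the factorized expression quoted above at two values of the dimension; whether the coefficient really takes this form away from four dimensions is the speculation of the text, not something the calculation establishes.

```python
from fractions import Fraction

def two_loop_coefficient(dim):
    """The factorized form (26 - D) * 19 / 2 quoted in the text."""
    return Fraction(26 - dim) * Fraction(19, 2)

print(two_loop_coefficient(4), two_loop_coefficient(26))
```

At four dimensions the expression reproduces 209, and it vanishes at 26 dimensions by construction.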

In summary, we have seen that the equivalence principle naturally leads to a 
generally covariant description of gravity in terms of curved manifolds. When 
general relativity is combined with GUT theory, we find the theory of inflation, 
which gives a plausible but not conclusive solution to the flatness and horizon 
problems. Attempts to go beyond general relativity have led to renewed interest 
in Kaluza-Klein theories, which unfortunately are not renormalizable and do 
not accommodate chiral fermions. Next, we will study perhaps the most nontrivial 
extension of quantum gravity, the supergravity theory and finally the superstring 
theory, which holds the promise of successfully uniting all interactions into one 
finite framework. 


19.11 Exercises 


1. Let the $\Gamma^\lambda_{\mu\nu}$ be independent fields, along with $g_{\mu\nu}$. Take the usual Lagrangian, 
$\sqrt{-g} R(\Gamma)$, except keep the Christoffel symbols as independent fields, not 
related to the metric (this is called the Palatini form of the action). Prove that 
the equations of motion for the Christoffel symbols yield the usual identity 
Eq. (19.17), and hence the Palatini action is identical to the usual one, at least 
classically. 


2. Do the same for $\omega_\mu^{ab}$. Prove that if the connection is an independent field and 
the action is taken as $\det e\, e^{a\mu} e^{b\nu} R_{\mu\nu}^{ab}(\omega)$, then the equations of motion for the 
connection are identical to its usual definition, given by Eq. (19.42). Unlike 
the Christoffel symbol, the connection $\omega_\mu^{ab}$ is a generally covariant vector. 
Prove this. 


3. To lowest order in $\kappa$, show that the lowest-order quadratic term in $h_{\mu\nu}$ arising 
from the linearized Einstein action equals Eq. (19.87). Prove that it is not 
invertible, so that a propagator does not exist unless we fix the gauge. 


4. Prove Eq. (19.13). 


5. Prove that, as $c \to \infty$, the Einstein equations of motion reduce to the 
usual Poisson equation for a gravitational potential $\phi$ in the presence of a 
source $\rho$. 


6. By varying the Einstein-Hilbert action, show explicitly that Einstein’s 
equations of motion are $R_{\mu\nu} = 0$ (without matter fields). To show 
this, you must prove that the terms containing $\delta R_{\mu\nu}$ in Eq. (19.26) can be 
dropped when calculating the equations of motion. 


7. Choose the harmonic gauge, where: 


1 
55 nh)” (19.95) 


is added to the action for arbitrary $\alpha$. Calculate the graviton propagator, and 
the Faddeev-Popov ghost term. Compare the result with Eq. (19.90). 


8. Prove that $\epsilon^{abcd} \epsilon^{\mu\nu\rho\sigma} R_{\mu\nu}^{ab}(\omega) R_{\rho\sigma}^{cd}(\omega)$ is a total derivative. 


9. Starting with Maxwell’s and Dirac’s equations coupled to gravity, show that 
the metric tensor couples to the energy-momentum tensor of the Maxwell 
field and the Dirac field. 


For the Kaluza—Klein theory, show explicitly that the five-dimensional Ein- 
stein—Hilbert action reduces to the usual four-dimensional Einstein—Hilbert 
action coupled to the Maxwell action, in the limit that the radius of the fifth 
dimension becomes large. 


Construct explicitly the Kaluza—Klein decomposition of a theory where the 
isometry group is O(N) and extract the Yang—Mills theory. 


Insert a cosmological term. Show that the radially symmetric solution of Eq. 
(19.44) necessarily gives an exponential expansion (i.e., de Sitter space). 


Prove, for an arbitrary matrix M: 
5 (det M) = (det M)(M~')'/5M;; (19.96) 


Using the expansion gy) = Ny» +K/Ay,y, find the exact relationship between 
Newton’s constant G and x. 


Prove that the metric in Eq. (19.72) is the inverse of the metric in Eq. (19.71). 


Prove that the action in Eq. (19.73), with the proper Killing vectors, yields 
the usual Yang—Mills theory after dimensional reduction. 


Power expand the Einstein—Hilbert action and explicitly derive all cubic terms 
in h,, in the harmonic gauge. Also, for the quartic and quintic terms, count 
the total number of ways in which four- and five-graviton fields and two 
derivatives may be contracted onto each other. From this, one can appreciate 
the complexity of doing calculations in quantum gravity. 


Chapter 20 
Supersymmetry and Supergravity 


Supersymmetry is an answer looking for a problem. 
—Anonymous 


20.1 Supersymmetry 


Supersymmetry has a long and interesting history. Apparently, the first known 
mention of a supersymmetric group was by Miyazawa,¹ who discovered the 
supergroup SU(M/N) in 1966. His motivation was to find a Master Group that 
could combine both internal groups and noncompact space-time groups in a non- 
trivial fashion. Supergroups, in fact, are the only known way in which to avoid 
the Coleman-Mandula theorem, which forbids naive unions of compact and non- 
compact groups. Unfortunately, this important work was largely ignored by the 
physics community. 

Supersymmetry was rediscovered in 1971, from two entirely different ap- 
proaches. In the first, the Neveu-—Schwarz—Ramond superstring '~* was found to 
possess a new anticommuting gauge symmetry. From this, Gervais and Sakita ? 
then wrote down the first supersymmetric action, the two-dimensional superstring 
action. The second approach was that of Gol’fand and Likhtman, * who were 
looking for a generalization of the usual space-time algebra and found the super 
Poincaré algebra. 

In 1972, Volkov and Akulov * found a nonlinear supersymmetric theory. And 
finally in 1974, Wess and Zumino © wrote down the first four-dimensional point- 
particle field theory action. 

Although a wide variety of supersymmetric actions were then discovered in the 
1970s, for many years supersymmetry was considered a mathematical oddity, since 
none of the known subatomic particles had supersymmetric partners. However, 
its possible application to quantum physics came when attempts were made to 
iron out the inconsistencies of GUT theories. 

In the previous chapter, we saw that one of the theoretical problems facing 
the GUT theory was the hierarchy problem; that is, renormalization effects will 
inevitably mix the two mass scales in the theory, the GUT scale $M_X$ and the 
electroweak energy scale $M_W$. Thus, even if we fine-tune the theory at the 
beginning to one part in 10!, they will still mix, ruining the separation between 
these two mass scales. This means that we have to perform an infinite number of 
distinct fine-tunings for each order in perturbation theory, which is undesirable. 

One appealing solution to the hierarchy problem is to include supersymmetry, 
both local and global. There are powerful nonrenormalization theorems in super- 
symmetric theories that show that higher order interactions do not renormalize the 
mass scale; that is, we do not have to fine-tune these parameters to each order in 
perturbation theory. One fine-tuning at the beginning is enough. This does not ex- 
plain where this original fine tuning came from; it only explains why higher-loop 
graphs do not mix the two mass scales. 

There are, however, many other reasons for examining supersymmetric theo- 
ries. One of the main problems in building unified field theories is the inability to 
find a gauge group that can combine the particle spectrum with quantum gravity. 
The problem is the no-go theorem, which states that a group that nontrivially 
combines both the Lorentz group and a compact Lie group cannot have finite 
dimensional, unitary representations. This means that attempts to build a “master 
group” that combines both gravity and the particle spectrum face an insurmount- 
able difficulty. 

There is, however, a way to evade the Coleman—Mandula theorem, and that 
is to use supersymmetry. Since anticommuting Grassmann numbers were never 
contemplated in the original derivation, the no-go theorem breaks down. The 
Coleman—Mandula theorem never analyzed a nontrivial symmetry that mixes 
bosonic and fermionic fields and places both in the same multiplet: 


$$\text{Bosons} \leftrightarrow \text{Fermions} \tag{20.1}$$


Thus, there exists a supersymmetry operator $Q$ that converts boson states $|B\rangle$ into 
fermion states: 


$$Q\,|B\rangle = |F\rangle \tag{20.2}$$


As a consequence, electrons can appear in the same multiplet as the Maxwell field. 
In fact, there is the possibility of placing all the known particles found in nature 
into the same multiplet. 

Perhaps one of the most remarkable aspects of supersymmetry is that it yields 
field theories that are finite to all orders in perturbation theory. In particular, we 
will outline the proof that the N = 4 super Yang—Mills theory, and certain versions 
of the N = 2 super Yang-Mills theory, are finite to all orders; that is, Z = 1 for 
all renormalization constants.⁷⁻¹⁰ This is a surprising result, which indicates the 
power of supersymmetry in eliminating many, if not all, of the divergences of 
certain quantum field theories. 

Yet another attractive feature of the theory is that once supersymmetry becomes 
a local gauge symmetry, it inevitably becomes a theory of gravity. This new theory, 
called supergravity,¹¹⁻¹⁵ has a new set of Ward identities that render the theory 
much more convergent than ordinary gravity. In fact, the largest supergravity 
theory, which has SO(8) symmetry, is almost big enough to accommodate all the 
elementary particles. 

We should caution the reader, however, about the limitations of supergravity 
as well. First, although supergravity is not as divergent as ordinary gravity, the 
theory still is not finite. Local supersymmetry, by itself, is not powerful enough to 
cancel all divergences of the theory. Second, the group SO(8) cannot (without extra 
bound states) include all the particles of the Standard Model. 

To remedy some of these problems, we will have to go to yet another, more 
powerful theory, the superstring theory. 


20.2 Supersymmetric Actions 


We would first like to show that supersymmetry forces us to have equal numbers 
of bosons and fermions. The simplest example is the Hamiltonian: 


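$$H = \omega_a\, a^\dagger a + \omega_b\, b^\dagger b \tag{20.3}$$

where we have bosonic and fermionic harmonic oscillator operators that obey: 

$$[a, a^\dagger] = \{b, b^\dagger\} = 1 \tag{20.4}$$

The supersymmetric operator $Q$ is defined as: 

$$Q = b^\dagger a + a^\dagger b \tag{20.5}$$


If $a^\dagger|0\rangle$ is a one boson state, then $Q\,a^\dagger|0\rangle$ becomes a one fermion state, and vice 
versa. $Q$ obeys the following identity: 


$$[Q, H] = (\omega_a - \omega_b)\,Q \tag{20.6}$$


If $\omega_a = \omega_b = \omega$, then the supersymmetric operator $Q$ commutes with the 
Hamiltonian and: 


$$\{Q, Q^\dagger\} = \frac{2}{\omega}\, H \tag{20.7}$$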

These identities show that $Q$ and $Q^\dagger$ form a closed algebra with the Hamiltonian if 
the fermions and bosons have equal energy. The unusual feature of these identities 
is that the supersymmetric generator Q, in some sense, is the “square root” of the 
Hamiltonian. Furthermore, this highlights the fact that supersymmetry closes on 
space-time transformations. In this sense, it is radically different from the other 
symmetries that we have studied so far, which have treated space-time and isospin 
as entirely unrelated. 
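
As a check on Eq. (20.7), the anticommutator can be worked out directly from Eqs. (20.4) and (20.5). The sketch below assumes, in addition to the relations stated in the text, the usual fermionic conditions $b^2 = (b^\dagger)^2 = 0$ and that the bosonic operators commute with the fermionic ones; neither assumption is spelled out above.

```latex
% Q = b^\dagger a + a^\dagger b is Hermitian, so {Q, Q^\dagger} = 2 Q^2.
\begin{align*}
Q^2 &= (b^\dagger a + a^\dagger b)(b^\dagger a + a^\dagger b) \\
    &= b^\dagger b\, a a^\dagger + b b^\dagger\, a^\dagger a
       && \text{(terms with } (b^\dagger)^2 \text{ or } b^2 \text{ vanish)} \\
    &= b^\dagger b\,(1 + a^\dagger a) + (1 - b^\dagger b)\, a^\dagger a \\
    &= a^\dagger a + b^\dagger b , \\[4pt]
\{Q, Q^\dagger\} &= 2Q^2 = 2\,(a^\dagger a + b^\dagger b)
    = \frac{2}{\omega}\, H \qquad \text{when } \omega_a = \omega_b = \omega . \\[4pt]
% The two pieces of Q are separately eigen-operators of H:
[\,b^\dagger a,\, H\,] &= (\omega_a - \omega_b)\, b^\dagger a , \qquad
[\,a^\dagger b,\, H\,] = -(\omega_a - \omega_b)\, a^\dagger b .
\end{align*}
```

Both commutators in the last line vanish at equal frequencies, which is the case used in the text.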

Another unusual consequence of this simple exercise is that the energy of 
the vacuum must be zero in order to have supersymmetry. To see this, take the 
vacuum expectation value of both sides of the previous equation. In order to have 
a supersymmetric vacuum, we must have: 


$$Q\,|0\rangle = 0 \tag{20.8}$$

However, this implies that: 

$$\langle 0|H|0\rangle = 0 \tag{20.9}$$


so that the vacuum must have zero energy. (This will have important implications 
later, when we discuss supersymmetry breaking. In broken theories, we will find 
that the vacuum energy becomes positive.) 
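
The step from Eq. (20.8) to Eq. (20.9), and the remark about broken theories, can be made explicit with Eq. (20.7). A minimal sketch, assuming $\omega > 0$ and a normalized state $|0\rangle$:

```latex
\begin{align*}
\langle 0|H|0\rangle
  &= \frac{\omega}{2}\,\langle 0|\{Q, Q^\dagger\}|0\rangle \\
  &= \frac{\omega}{2}\left( \big\| Q^\dagger|0\rangle \big\|^2
                          + \big\| Q|0\rangle \big\|^2 \right) \;\ge\; 0 .
\end{align*}
% Equality holds only if Q|0> = Q^\dagger|0> = 0, i.e. for a supersymmetric vacuum.
% If Q|0> \neq 0, the vacuum energy is strictly positive.
```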

To use supersymmetry to construct new actions, let us examine the very first and 
simplest supersymmetric action, which was discovered in 1971. This is the action 
found by Gervais and Sakita that describes the Neveu–Schwarz–Ramond 
superstring: 


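$$\mathcal{L} = \bar\psi^a\, i\gamma^\mu \partial_\mu \psi^a + \partial_\mu\phi^a\, \partial^\mu\phi^a \tag{20.10}$$


which is defined in two dimensions for real, Majorana spinors (and where $a$ is 
an additional vector index that does not concern us here). The action is invariant 
under: 


$$\delta\psi^a = -i\gamma^\mu \partial_\mu\phi^a\, \epsilon; \qquad \delta\phi^a = \bar\epsilon\,\psi^a \tag{20.11}$$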


There are several unusual features of this action. We first notice that the 
supersymmetric parameter $\epsilon$ is an anticommuting spinor. This means that many of the 
classical theorems concerning Lie groups and Lie algebras no longer hold. 
Second, the fermions and bosons have the same index $a$; that is, they must transform 
under the same representation of some group. (This will have important 
implications for the theory of super GUTs, because the fermions usually transform under 
the fundamental representation, while the Yang–Mills field transforms under the 
adjoint representation. Super GUT theories, therefore, cannot easily place the 
quarks and gauge particles in the same representation.) 

Third, we notice that if we take the commutator of two successive supersymmetry 
variations, we find: 


$$[\delta_1, \delta_2]\,\phi^a = \bar\epsilon_1 \gamma^\mu \epsilon_2\, P_\mu \phi^a - (1 \leftrightarrow 2) \tag{20.12}$$


These commutation relations mean that there exists a spinor operator $Q$, whose 
anticommutation relations with itself yield the translation operator $P_\mu$. This 
generalizes the discussion we found earlier, where $Q$ formed a closed algebra 
with $H$. Now, we find that the supersymmetric generator forms a closed algebra 
with the vector $P_\mu$. 
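
Equation (20.12) follows from applying the variations of Eq. (20.11) twice. The sketch below takes Eq. (20.11) to read $\delta\psi^a = -i\gamma^\mu\partial_\mu\phi^a\,\epsilon$ and $\delta\phi^a = \bar\epsilon\psi^a$, assumes $P_\mu = i\partial_\mu$ acting on fields (a convention not stated explicitly at this point), and uses the standard Majorana identity $\bar\epsilon_2\gamma^\mu\epsilon_1 = -\bar\epsilon_1\gamma^\mu\epsilon_2$ for anticommuting spinors:

```latex
\begin{align*}
\delta_1 \delta_2\, \phi^a
  &= \bar\epsilon_2\, \delta_1\psi^a
   = -i\, \bar\epsilon_2 \gamma^\mu \epsilon_1\, \partial_\mu \phi^a , \\
[\delta_1, \delta_2]\, \phi^a
  &= i\, \bar\epsilon_1 \gamma^\mu \epsilon_2\, \partial_\mu \phi^a - (1 \leftrightarrow 2) \\
  &= \bar\epsilon_1 \gamma^\mu \epsilon_2\, P_\mu \phi^a - (1 \leftrightarrow 2)
   = 2\, \bar\epsilon_1 \gamma^\mu \epsilon_2\, P_\mu \phi^a .
\end{align*}
```

Two supersymmetry transformations therefore combine into a translation with parameter proportional to $\bar\epsilon_1\gamma^\mu\epsilon_2$.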

The previous action was written down in only two dimensions. To obtain a 
four-dimensional theory, we now study the free Wess—Zumino action, where we 
again have Majorana spinors: 


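$$S = \frac{1}{2}\int d^4x\, \left[ (\partial_\mu A)^2 + (\partial_\mu B)^2 + i\bar\psi\gamma^\mu\partial_\mu\psi + F^2 + G^2 \right] \tag{20.13}$$

This action is invariant under: 

$$\begin{aligned}
\delta A &= \bar\epsilon\,\psi \\
\delta B &= \bar\epsilon\,\gamma_5\psi \\
\delta F &= i\bar\epsilon\,\gamma^\mu\partial_\mu\psi \\
\delta G &= i\bar\epsilon\,\gamma_5\gamma^\mu\partial_\mu\psi \\
\delta\psi &= -i\gamma^\mu\left(\partial_\mu A + \gamma_5\partial_\mu B\right)\epsilon - \left(F + \gamma_5 G\right)\epsilon
\end{aligned} \tag{20.14}$$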


This action contains equal numbers of fermions and bosons, as desired. There are 
four components within the off-shell Majorana field $\psi$, and four boson fields $A$, 
B, F, and G. It is easy to show that repeated variations of these 4 + 4 off-shell 
fields close linearly among themselves. The supersymmetric algebra is thus linear. 
However, because F and G are auxiliary fields, we can eliminate them from the 
action from the very start. After this seemingly trivial elimination, the resulting 
action no longer has equal numbers of fermions and bosons. The action is still 
invariant under a modified form of supersymmetry, although it is no longer linear. 
By taking two such nonlinear supersymmetric variations, we find that the algebra 
does not close. This may seem disturbing, until we realize that the term which 
breaks the closure of the algebra is proportional to the equations of motion. The 
algebra then closes on-shell; that is, we must use the on-shell equations of motion 
in order to close the nonlinear supersymmetric relations. 

It is more convenient therefore to retain these auxiliary fields in order to 
maintain the complete off-shell, linear algebra. In fact, one of the most pressing 
and unsolved problems in developing higher supersymmetric actions is to find all 
the auxiliary fields that will linearize the supersymmetric gauge transformations. 
The problem of writing down higher supersymmetric actions, in fact, often boils 
down to the highly nontrivial task of finding all auxiliary fields that linearize the 
supersymmetry algebra. 

(It is also instructive to perform the on-shell counting of states for this action, 
to confirm that we have the same number of fermions and bosons on-shell. The 
Majorana fermion, which had four components off-shell, now only has two com- 
ponents on-shell. Likewise, on-shell the F and G fields vanish, leaving us with 2 
+ 2 fermion and boson states on-shell, as desired.) 
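
The off-shell and on-shell counting for the Wess–Zumino multiplet can be tabulated as follows. The entries simply restate the numbers quoted in the text, with real components counted throughout; the layout is only an illustrative summary.

```latex
\begin{array}{l|cc}
\text{field} & \text{off-shell} & \text{on-shell} \\ \hline
\psi \ (\text{Majorana}) & 4 & 2 \\ \hline
A,\ B & 2 & 2 \\
F,\ G & 2 & 0 \\ \hline
\text{fermions} : \text{bosons} & 4 : 4 & 2 : 2
\end{array}
% F and G drop out on-shell because they enter Eq. (20.13) only through
% F^2 + G^2, so their equations of motion are algebraic: F = 0 and G = 0.
```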

Supersymmetry also generalizes to gauge theories. For example, the following 
gauge action with a Maxwell field and a Majorana spinor is invariant under global 
supersymmetry. It is the supersymmetric counterpart of QED: 


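$$S = \int d^4x\, \left( -\frac{1}{4}F_{\mu\nu}^2 + \frac{i}{2}\bar\psi\gamma^\mu\partial_\mu\psi + \frac{1}{2}D^2 \right) \tag{20.15}$$


The fields transform under: 


$$\begin{aligned}
\delta A_\mu &= i\bar\epsilon\,\gamma_\mu\psi \\
\delta\psi &= \left( \tfrac{1}{2}\sigma^{\mu\nu}F_{\mu\nu} - \gamma_5 D \right)\epsilon \\
\delta D &= i\bar\epsilon\,\gamma_5\gamma^\mu\partial_\mu\psi
\end{aligned} \tag{20.16}$$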


Once again, we have equal numbers of fermions and bosons off-shell. 
Off-shell, the Majorana field has 4 components, while the $A_\mu$ field has 3 components 
(because one is eliminated by gauge fixing), and the $D$ field has one component. 
We therefore have 4 + 4 fermions and bosons. On-shell, we also have the same 
number of fermions and bosons. The $\psi$ field now only has two components, the 
$D$ field disappears, and the $A_\mu$ field has two components, so we are left with 2 + 
2 fields on-shell. When we generalize this to non-Abelian gauge transformations, 
we will find that the fermionic $\psi$ must transform in the adjoint representation, the 
same representation as the gauge fields, since they all belong in the same multiplet. 

So far, we have been exploring actions written totally in terms of their compo- 
nent fields. This, however, becomes prohibitively difficult as we go to non-Abelian 
and gravitational theories. In order to systematically generate new supersymmetric 
actions, we now turn to a new formalism, the superspace formalism. 


20.3 Superspace 


Unfortunately, the number of fields rapidly escalates for higher supersymmetric 
actions. Perhaps one of the most beautiful ways in which to compress the blizzard 
of indices that often appears in supersymmetric theories is through superspace.¹⁶ 
This construction postulates the existence of four anticommuting coordinates $\theta_\alpha$ 
that form the superpartner of the usual space-time coordinate: 


$$\{x^\mu, \theta_\alpha\} \tag{20.17}$$


Supersymmetry, acting on the superspace coordinates, makes the following 
transformation: 


$$x^\mu \to x^\mu + i\bar\epsilon\gamma^\mu\theta; \qquad \theta_\alpha \to \theta_\alpha + \epsilon_\alpha \tag{20.18}$$

In practice, the use of complex Dirac spinors leads to reducible representations 
of supersymmetry. In order to find irreducible representations, we will find it 
more convenient to use Majorana or Weyl spinors. We will therefore split the 
four-component spinor into two smaller spinors as follows: 


$$\theta = \begin{pmatrix} \theta_\alpha \\ \bar\theta^{\dot\alpha} \end{pmatrix} \tag{20.19}$$

(Because of this split of four-component spinors down to two-component spinors, 
we will, unfortunately, find that the number of indices for irreducible representa- 
tions proliferates considerably.) 

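In this formalism, we will take a modified Weyl representation of the Dirac 
matrices: 


$$\gamma^\mu = \begin{pmatrix} 0 & \sigma^\mu \\ \bar\sigma^\mu & 0 \end{pmatrix}; \qquad \gamma_5 = \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix} \tag{20.20}$$


where: 


$$\sigma^\mu = (1, \sigma^i) = (\sigma^\mu)_{\alpha\dot\alpha}; \qquad \bar\sigma^\mu = (1, -\sigma^i) = (\bar\sigma^\mu)^{\dot\alpha\alpha} \tag{20.21}$$


Then the typical spinor breaks up as follows: 


$$\psi = \begin{pmatrix} \phi_\alpha \\ \bar\chi^{\dot\alpha} \end{pmatrix}; \qquad \bar\psi = \psi^\dagger\gamma^0 = (\chi^\alpha,\ \bar\phi_{\dot\alpha}) \tag{20.22}$$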


In this representation, the spinors become reducible; that is, the four-spinor $\psi$ 
has now been broken up into two two-spinors $\phi$ and $\chi$, each of which forms a 
two-dimensional representation of the Lorentz group. The Lorentz group generators 
can be obtained by multiplying the old generators $M_{\mu\nu} = \sigma_{\mu\nu}/2$ by the chiral 
projection operators $P_\pm = \frac{1}{2}(1 \pm i\gamma_5)$. In this way, we can split the original 4 x 4 
Lorentz generators into two distinct 2 x 2 blocks. Each two-spinor then transforms 
under a 2 x 2 complex representation of the Lorentz group, which we can show 
is SL(2, C), the set of 2 x 2 complex matrices with unit determinant. We use the 
fact that: 


$$O(3,1) \sim SL(2, C) \tag{20.23}$$

Let the two-spinor $\theta_\alpha$ transform as the fundamental representation of SL(2, C), 
where $\alpha = 1, 2$. The complex conjugate of these matrices generates an inequivalent 
representation of SL(2, C). We will label these two-spinors as $\bar\theta_{\dot\alpha}$, where $\dot\alpha = 1, 2$ 
and the dot reminds us that the two-spinor transforms under the inequivalent 
complex conjugate representation of SL(2, C). 
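We take the conjugation of spinors as follows: 


$$(\theta_\alpha)^* = \bar\theta_{\dot\alpha}; \qquad (\theta^\alpha)^* = \bar\theta^{\dot\alpha} \tag{20.24}$$

with: 

$$\epsilon_{21} = \epsilon^{12} = -\epsilon_{12} = -\epsilon^{21} = +1 \tag{20.25}$$

so that: 


$$\begin{aligned}
\psi^\alpha &= \epsilon^{\alpha\beta}\psi_\beta; \qquad \bar\psi^{\dot\alpha} = \epsilon^{\dot\alpha\dot\beta}\bar\psi_{\dot\beta} \\
\psi_\alpha &= \epsilon_{\alpha\beta}\psi^\beta; \qquad \bar\psi_{\dot\alpha} = \epsilon_{\dot\alpha\dot\beta}\bar\psi^{\dot\beta}
\end{aligned} \tag{20.26}$$


Invariants under each of the two groups are given by: 


$$\begin{aligned}
\phi\chi &= \phi^\alpha\chi_\alpha = -\phi_\alpha\chi^\alpha \\
\bar\phi\bar\chi &= \bar\phi_{\dot\alpha}\bar\chi^{\dot\alpha} = -\bar\phi^{\dot\alpha}\bar\chi_{\dot\alpha}
\end{aligned} \tag{20.27}$$


and: 


$$\theta^2 = \theta^\alpha\theta_\alpha; \qquad \bar\theta^2 = \bar\theta_{\dot\alpha}\bar\theta^{\dot\alpha} \tag{20.28}$$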
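
In this way, the standard invariant $\bar\psi\psi$ for four-spinors decomposes as: 


$$\bar\psi\psi = \psi^\dagger\gamma^0\psi = (\bar\phi,\ \chi)\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} \phi \\ \bar\chi \end{pmatrix} = \phi\chi + \bar\chi\bar\phi \tag{20.29}$$


We also have: 

$$\bar\psi_1\gamma^\mu\psi_2 = \chi_1\sigma^\mu\bar\chi_2 + \bar\phi_1\bar\sigma^\mu\phi_2 \tag{20.30}$$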


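In two-spinor notation, the supersymmetric transformation in superspace in 
Eq. (20.18) is written as: 

$$\begin{aligned}
x^\mu &\to x^\mu + i\epsilon\sigma^\mu\bar\theta - i\theta\sigma^\mu\bar\epsilon \\
\theta^\alpha &\to \theta^\alpha + \epsilon^\alpha \\
\bar\theta_{\dot\alpha} &\to \bar\theta_{\dot\alpha} + \bar\epsilon_{\dot\alpha}
\end{aligned} \tag{20.31}$$


Given this superspace transformation, it is now a simple matter to extract the 
operators that generate this transformation: 


$$\begin{aligned}
Q_\alpha &= i\frac{\partial}{\partial\theta^\alpha} - (\sigma^\mu\bar\theta)_\alpha\,\partial_\mu \\
\bar Q_{\dot\alpha} &= -i\frac{\partial}{\partial\bar\theta^{\dot\alpha}} + (\theta\sigma^\mu)_{\dot\alpha}\,\partial_\mu
\end{aligned} \tag{20.32}$$


The supersymmetric algebra now reads: 


$$\begin{aligned}
\{Q_\alpha, \bar Q_{\dot\beta}\} &= 2(\sigma^\mu)_{\alpha\dot\beta}\,P_\mu \\
\{Q_\alpha, Q_\beta\} &= \{\bar Q_{\dot\alpha}, \bar Q_{\dot\beta}\} = 0 \\
[Q_\alpha, M_{\mu\nu}] &= \frac{1}{2}(\sigma_{\mu\nu})_\alpha{}^\beta\, Q_\beta; \qquad [\bar Q_{\dot\alpha}, M_{\mu\nu}] = -\frac{1}{2}\,\bar Q_{\dot\beta}\,(\bar\sigma_{\mu\nu})^{\dot\beta}{}_{\dot\alpha} \\
[Q_\alpha, P_\mu] &= 0; \qquad [\bar Q_{\dot\alpha}, P_\mu] = 0
\end{aligned} \tag{20.33}$$


where $\sigma^{\mu\nu} = \frac{i}{2}[\gamma^\mu, \gamma^\nu]$. 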
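
The first relation of the algebra can be checked directly from the differential operators. This is only a sketch under an assumed reading of Eq. (20.32), namely $Q_\alpha = i\,\partial/\partial\theta^\alpha - (\sigma^\mu\bar\theta)_\alpha\partial_\mu$ and $\bar Q_{\dot\beta} = -i\,\partial/\partial\bar\theta^{\dot\beta} + (\theta\sigma^\mu)_{\dot\beta}\partial_\mu$, together with $P_\mu = i\partial_\mu$; the overall factors of $i$ are a convention and may differ from the intended ones.

```latex
% Only the cross terms between a Grassmann derivative and the matching
% Grassmann coordinate contribute, using
%   {\partial/\partial\theta^\alpha, \theta^\gamma} = \delta_\alpha^\gamma ,
%   {\partial/\partial\bar\theta^{\dot\beta}, \bar\theta^{\dot\gamma}} = \delta_{\dot\beta}^{\dot\gamma} .
\begin{align*}
\{Q_\alpha, \bar Q_{\dot\beta}\}
 &= \Big\{ i\frac{\partial}{\partial\theta^\alpha},\
           \theta^\gamma(\sigma^\mu)_{\gamma\dot\beta}\,\partial_\mu \Big\}
  + \Big\{ -(\sigma^\mu)_{\alpha\dot\gamma}\bar\theta^{\dot\gamma}\,\partial_\mu,\
           -i\frac{\partial}{\partial\bar\theta^{\dot\beta}} \Big\} \\
 &= i(\sigma^\mu)_{\alpha\dot\beta}\,\partial_\mu + i(\sigma^\mu)_{\alpha\dot\beta}\,\partial_\mu \\
 &= 2(\sigma^\mu)_{\alpha\dot\beta}\,P_\mu , \qquad P_\mu = i\partial_\mu .
\end{align*}
% {Q_\alpha, Q_\beta} = 0 follows in the same way, because Q_\alpha contains
% \partial/\partial\theta but no explicit \theta.
```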

Using superspace methods, let us now construct a few representations of 
supersymmetry. We begin by constructing a vector superfield $V(x, \theta, \bar\theta)$ that is a 
function of superspace. Under a supersymmetric transformation, it transforms as: 


$$\delta V(x, \theta, \bar\theta) = i\,[\epsilon Q + \bar Q\bar\epsilon,\ V] \tag{20.34}$$

(Since the supersymmetry generator has spin $\frac{1}{2}$, this means that the supersymmetric 
partner of any particle must differ from it by spin $\frac{1}{2}$. Supersymmetric multiplets can 
then be grouped into collections of particles, each differing from the other by spin 
$\frac{1}{2}$.) 

These superfields have many nice properties. The most important is that the 
product of two superfields is again a superfield: 


$$V_1 V_2 = V_3 \tag{20.35}$$


This can be simply checked by examining the transformation properties of both 
sides of the equation. Although this product rule is simple in superspace, written in 
component form it is highly nontrivial. Superspace thus gives a way of generating 
new representations of supersymmetry from old ones. 

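By power expanding $V(x, \theta, \bar\theta)$ in a power series in $\theta$ and $\bar\theta$, we find that 
the series terminates after reaching the fourth power of the spinor because of its 
Grassmann nature. Since $V$ is real, the most general parametrization is given by 
a Taylor expansion in the Grassmann variables: 


$$\begin{aligned}
V(x, \theta, \bar\theta) ={}& C - i\theta\chi + i\bar\chi\bar\theta - \frac{i}{2}\theta^2(M - iN) + \frac{i}{2}\bar\theta^2(M + iN) \\
& - \theta\sigma_\mu\bar\theta\, A^\mu - i\bar\theta^2\theta\left(\lambda - \frac{i}{2}\sigma^\mu\partial_\mu\bar\chi\right) \\
& + i\theta^2\bar\theta\left(\bar\lambda - \frac{i}{2}\bar\sigma^\mu\partial_\mu\chi\right) - \frac{1}{2}\theta^2\bar\theta^2\left(D + \frac{1}{2}\partial^2 C\right)
\end{aligned} \tag{20.36}$$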


This is called the vector superfield because it contains an ordinary vector 
field $A_\mu$ (and not because the superfield has a vector index on it). The vector 
superfield has 8 fermionic fields contained within $\lambda$ and $\chi$ as well as 8 bosonic 
fields contained within $C$, $D$, $M$, $N$, $A_\mu$; so we have an equal number of fermions 
and bosons, as desired. 
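
Two statements above can be made concrete: why the expansion stops at fourth order, and how the 8 + 8 count arises. The sketch assumes two-component Grassmann coordinates $\theta^\alpha$, $\bar\theta^{\dot\alpha}$ (with $\alpha, \dot\alpha = 1, 2$) and counts real components, with $\chi$ and $\lambda$ taken as four-component Majorana spinors.

```latex
% Termination: each Grassmann coordinate squares to zero,
%   (\theta^1)^2 = (\theta^2)^2 = (\bar\theta^{\dot 1})^2 = (\bar\theta^{\dot 2})^2 = 0 ,
% so the highest nonvanishing monomial is
%   \theta^1\theta^2\,\bar\theta^{\dot 1}\bar\theta^{\dot 2} \;\propto\; \theta^2\bar\theta^2 .
\begin{array}{l|c}
\text{bosonic fields} & \text{components} \\ \hline
C,\ M,\ N,\ D & 1 + 1 + 1 + 1 = 4 \\
A_\mu & 4 \\ \hline
\text{total} & 8
\end{array}
\qquad
\begin{array}{l|c}
\text{fermionic fields} & \text{components} \\ \hline
\chi & 4 \\
\lambda & 4 \\ \hline
\text{total} & 8
\end{array}
```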

Under a supersymmetric transformation, we have: 


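$$V(x, \theta, \bar\theta) \to V(x^\mu + i\epsilon\sigma^\mu\bar\theta - i\theta\sigma^\mu\bar\epsilon,\ \theta + \epsilon,\ \bar\theta + \bar\epsilon) \tag{20.37}$$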


By power expanding the previous equation and then equating coefficients, we can 
calculate the variation of all the fields within the vector superfield: 


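$$\begin{aligned}
\delta C &= \bar\epsilon\,\gamma_5\chi \\
\delta\chi &= (M + \gamma_5 N)\epsilon - i\gamma^\mu\left(A_\mu + \gamma_5\partial_\mu C\right)\epsilon \\
\delta M &= \bar\epsilon\left(\lambda - i\,\partial\!\!\!/\,\chi\right) \\
\delta N &= \bar\epsilon\,\gamma_5\left(\lambda - i\,\partial\!\!\!/\,\chi\right) \\
\delta A_\mu &= i\bar\epsilon\,\gamma_\mu\lambda + \bar\epsilon\,\partial_\mu\chi \\
\delta\lambda &= -i\sigma^{\mu\nu}\epsilon\,\partial_\mu A_\nu - \gamma_5\epsilon\, D \\
\delta D &= -i\bar\epsilon\,\partial\!\!\!/\,\gamma_5\lambda
\end{aligned} \tag{20.38}$$
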
We can also introduce a new derivative operator: 


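$$\begin{aligned}
D_\alpha &= \frac{\partial}{\partial\theta^\alpha} - i(\sigma^\mu\bar\theta)_\alpha\,\partial_\mu \\
\bar D_{\dot\alpha} &= -\frac{\partial}{\partial\bar\theta^{\dot\alpha}} + i(\theta\sigma^\mu)_{\dot\alpha}\,\partial_\mu
\end{aligned} \tag{20.39}$$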


The importance of this derivative operator is that it anticommutes with the super- 
symmetric generators. We list a few useful identities of this operator that we will 
use extensively in this chapter: 


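$$\begin{aligned}
\{D_\alpha, D_\beta\} &= 0; \qquad \{\bar D_{\dot\alpha}, \bar D_{\dot\beta}\} = 0 \\
\{D_\alpha, \bar D_{\dot\alpha}\} &= 2i(\sigma^\mu)_{\alpha\dot\alpha}\,\partial_\mu \\
D_\alpha D_\beta D_\gamma &= \bar D_{\dot\alpha}\bar D_{\dot\beta}\bar D_{\dot\gamma} = 0 \\
D^\alpha \bar D^2 D_\alpha &= \bar D_{\dot\alpha} D^2 \bar D^{\dot\alpha} \\
D^2 \bar D^2 D^2 &= -16\,\partial^2 D^2 \\
\bar D^2 D^2 \bar D^2 &= -16\,\partial^2 \bar D^2 \\
[D^2, \bar D^2] &= -16\,\partial^2 - 8i\, D^\alpha(\sigma^\mu)_{\alpha\dot\alpha}\,\partial_\mu\,\bar D^{\dot\alpha}
\end{aligned} \tag{20.40}$$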


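[Proving these formulas is not as formidable as one might expect. Once one 
establishes the anticommutator between $D$ and $\bar D$, the other relations follow. For 
example, the commutator between $D^2$ and $\bar D^2$ can be evaluated by pushing all 
$D_\alpha$ to the right. Each time they pass a $\bar D_{\dot\alpha}$, we pick up $2i(\sigma^\mu)_{\alpha\dot\alpha}\partial_\mu$. Thus, after 
pushing all $D_\alpha$ to the right, we have: 


$$\begin{aligned}
[D^2, \bar D^2] &= [D_\alpha\epsilon^{\alpha\beta}D_\beta,\ \bar D_{\dot\alpha}\epsilon^{\dot\alpha\dot\beta}\bar D_{\dot\beta}] \\
&= 2(2i)(2i)\,(\epsilon\sigma^\mu)^{\alpha}{}_{\dot\alpha}\,(\sigma^\nu\epsilon)_{\alpha}{}^{\dot\alpha}\,\partial_\mu\partial_\nu + \cdots \\
&= 8\,\mathrm{Tr}\left(\sigma^\mu\epsilon\,\sigma^{\nu T}\epsilon\right)\partial_\mu\partial_\nu + \cdots \\
&= -16\,\partial^2 + \cdots
\end{aligned} \tag{20.41}$$

where $\epsilon$ is the 2 x 2 matrix $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$. Finally, the identities involving six 
derivative operators are proven by multiplying $[D^2, \bar D^2]$ by $D^2$ or $\bar D^2$. In this 
way, all identities can be proved.] 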

Because $\bar D_{\dot\alpha}$ anticommutes with the supersymmetric generator, we can apply 
it at will on any superfield to form a constraint, such as: 


$$\bar D_{\dot\alpha}\Phi = 0 \tag{20.42}$$


This constraint does not spoil the transformation of $\Phi$ under supersymmetry 
because $\bar D_{\dot\alpha}$ anticommutes with the supersymmetry generator. A field that satisfies 
this constraint is called a chiral superfield. (An antichiral superfield satisfies 
$D_\alpha\bar\Phi = 0$.) 

It is simple to write down the solution to the chiral constraint equation: 


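$$\Phi(x, \theta, \bar\theta) = \exp\left(-i\theta\sigma^\mu\bar\theta\,\partial_\mu\right)\phi(x, \theta); \qquad \bar\Phi(x, \theta, \bar\theta) = \exp\left(i\theta\sigma^\mu\bar\theta\,\partial_\mu\right)\bar\phi(x, \bar\theta) \tag{20.43}$$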


The problem of finding chiral superfields then reduces to the simpler problem of 
power expanding $\phi(x, \theta)$, which terminates after only three terms: 


$$\phi(x, \theta) = A + 2\theta\psi - \theta^2 F \tag{20.44}$$


Once again, the numbers of fermion and boson fields are equal. $\psi$ contains four 
components, while $A$ and $F$ are complex scalar fields with four components in 
all. 
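
The solution (20.43) of the chiral constraint can be checked in a few lines. The sketch assumes the covariant derivative has the form $\bar D_{\dot\alpha} = -\partial/\partial\bar\theta^{\dot\alpha} + i(\theta\sigma^\mu)_{\dot\alpha}\partial_\mu$ (one consistent reading of Eq. (20.39); the overall sign is immaterial for the constraint) and introduces the shifted coordinate $y^\mu$, a shorthand that does not appear in the text.

```latex
% The exponential in Eq. (20.43) is a Taylor shift of the argument:
%   \Phi(x,\theta,\bar\theta) = \phi(y,\theta) ,
%   y^\mu = x^\mu - i\,\theta\sigma^\mu\bar\theta .
\begin{align*}
\bar D_{\dot\alpha}\, y^\mu
  &= -\frac{\partial}{\partial\bar\theta^{\dot\alpha}}
       \left( -i\,\theta^\beta(\sigma^\mu)_{\beta\dot\beta}\,\bar\theta^{\dot\beta} \right)
     + i(\theta\sigma^\nu)_{\dot\alpha}\,\partial_\nu x^\mu \\
  &= -i(\theta\sigma^\mu)_{\dot\alpha} + i(\theta\sigma^\mu)_{\dot\alpha} = 0 , \\
\bar D_{\dot\alpha}\,\theta^\beta &= 0
  \quad\Longrightarrow\quad
  \bar D_{\dot\alpha}\,\phi(y, \theta) = 0 .
\end{align*}
% The first term picks up a minus sign because the Grassmann derivative
% anticommutes past \theta^\beta before acting on \bar\theta.
```

Any function of $y$ and $\theta$ alone is therefore chiral, which is why only the expansion in $\theta$ is needed.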

Written out explicitly, the variation of the fields is given by: 


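$$\begin{aligned}
\delta A &= \epsilon\psi \\
\delta\psi &= -\epsilon F - i\,\partial_\mu A\,\sigma^\mu\bar\epsilon \\
\delta F &= -2i\,\partial_\mu\psi\,\sigma^\mu\bar\epsilon
\end{aligned} \tag{20.45}$$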


Given these vector and chiral superfields, we can now construct superfield actions 
that are manifestly supersymmetric. There are two ways in which supersymmetric 
invariant actions can be constructed, one for vector fields and the other for chiral 
fields. For vector fields, we simply integrate over all eight $x$, $\theta$, and $\bar\theta$ coordinates. 
This integration selects out the “D” term that appears in the expansion of the vector 
field. [The variation of the D term in Eq. (20.38) is a total derivative, and hence 
always integrates to zero. That is why D terms are always invariant.] For chiral 
fields, we only integrate over six variables, $x$ and $\theta$. This selects out the “F” term. 
(The variation of the F term is also a total derivative from the above equation.) 
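
To see concretely why the Grassmann integration isolates an invariant, consider the chiral case. The sketch assumes the normalization $\int d^2\theta\,\theta^2 = 1$ (with $\int d^2\theta\, 1 = \int d^2\theta\,\theta^\alpha = 0$), a convention not fixed in the text, and takes the component expansion $\phi = A + 2\theta\psi - \theta^2 F$ and the variation $\delta F = -2i\,\partial_\mu\psi\,\sigma^\mu\bar\epsilon$ as the reading of Eqs. (20.44) and (20.45).

```latex
\begin{align*}
\int d^4x\, d^2\theta\; \phi(x,\theta)
  &= \int d^4x\, d^2\theta\, \left( A + 2\theta\psi - \theta^2 F \right)
   = -\int d^4x\; F , \\
\delta \int d^4x\; F
  &= -2i \int d^4x\; \partial_\mu\!\left( \psi\,\sigma^\mu \bar\epsilon \right) = 0 ,
\end{align*}
% for constant \bar\epsilon and fields that fall off at infinity.
% The same argument applies to the D term of a vector superfield
% integrated over d^4x\, d^4\theta.
```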

The simplest action based on superspace is the Wess–Zumino action, given 
as: 


S= i aby x oo = ) d°x (us “k sm 2 x) +he,| (20.46) 


By integrating out the Grassmann variables, we retrieve the free Wess—Zumino 
action in Eq. (20.13) we wrote down earlier. 

Not only can we find a superspace formulation of the Wess—Zumino model, 
we can also find the superspace formulation of gauge theory. To introduce gauge 
invariance, we first notice that the vector field $V$ contains the field $A_\mu$, which will 
form the basis of a gauge theory. The variation of $V$, in turn, looks very much 
like a chiral superfield $\Lambda$, which contains the combination $\partial_\mu A$. 

What we want, therefore, is a real vector field V that transforms as: 


$$\delta V = -\frac{i}{2}\left(\Lambda - \bar\Lambda\right) \tag{20.47}$$


where $\Lambda$ is a chiral superfield. It is easy to show that this variation contains the 
U(1) symmetry transformation $\delta A_\mu \sim \partial_\mu A$. 

Now we wish to construct the counterpart of the Maxwell tensor $F_{\mu\nu}$, which 
is invariant under this transformation. Let us define a chiral superfield $W_\alpha$: 


$$W_\alpha = -\frac{1}{4}\bar D^2 D_\alpha V; \qquad \bar D_{\dot\beta} W_\alpha = 0 \tag{20.48}$$


where the last identity is important because it shows that $W_\alpha$ is a chiral superfield. 
One can show that the Maxwell tensor is contained within $W_\alpha$. Then it is easy to 
show that: 


$$\delta W_\alpha = 0 \tag{20.49}$$


where we have used the fact that both $\Lambda$ and $W_\alpha$ are chiral superfields. 
Our gauge-invariant action is therefore: 


$$\frac{1}{4}\int d^4x\, d^2\theta\; W^\alpha W_\alpha \tag{20.50}$$


which is invariant under both gauge and supersymmetry. 
Next, we must show that this yields the correct U(1) action when we perform 
the integration. In general, this integration is rather lengthy; so we will use a trick. 
We will use up many of the degrees of freedom in the gauge transformation so 
that the theory is defined only in terms of the important fields. 

Since $A$ and $F$ within the chiral field $\Lambda$ are complex, we will find it convenient 
to redefine these fields as $A \to A + iB$ and $F \to F + iG$. Then under this gauge 
transformation, we have: 


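$$\begin{aligned}
C &\to C + B \\
\chi &\to \chi + \psi \\
M &\to M + F \\
N &\to N + G \\
A_\mu &\to A_\mu + \partial_\mu A \\
\lambda &\to \lambda \\
D &\to D
\end{aligned} \tag{20.51}$$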


so that the $A$ field is the gauge parameter associated with the gauge field $A_\mu$. We 
can obviously use $B$, $\psi$, $F$, and $G$ to eliminate $C$, $\chi$, $M$, and $N$. This leaves us 
with a reduced vector multiplet: 


$$V = (0, 0, 0, 0, A_\mu, \lambda, D) \tag{20.52}$$


We call this the Wess–Zumino gauge,¹⁷ where we have partially used up the gauge 
degree of freedom within the chiral superfields, leaving us with only the gauge 
multiplet that includes the Maxwell field $A_\mu$. Placing the chiral superfield $W_\alpha$ into 
the action, we obtain the original super Maxwell theory of Eq. (20.15). 

The generalization of this construction to the full non-Abelian theory is also 
straightforward. Let us define $V = V^a\tau^a$, where the $\tau^a$ matrices generate some 
Lie group. Then $V$ transforms as: 


$$\delta V = ig\left(\Lambda - \Lambda^\dagger\right) \tag{20.53}$$

for some chiral superfield $\Lambda$. Then we also have: 
eV _, eiAg—2V gih (20.54) 
Then define: 
i= -;DDe™ De (20.55) 
If we include matter fields within a chiral superfield, then: 


oe ie, gb — geiA (20.56) 
Then the coupling to the matter fields arises through: 
ge Vo (20.57) 


Let us now put everything together. The most general coupling between a 
superfield $V$ and a matter superfield $\Phi$ is given by: 


1 
F= ag i d*x d’0 W°W, + he. 


1 7 
re [ atsato ge" (20.58) 


After performing the Grassmann integrations, the action equals: 


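$$\begin{aligned}
S = \int d^4x\; \mathrm{Tr}\,\Big( & -\frac{1}{4}F_{\mu\nu}^2 + \frac{1}{2}\bar\lambda\, i\gamma^\mu\nabla_\mu\lambda + \frac{1}{2}D^2 \\
& + \frac{1}{2}\nabla_\mu A\,\nabla^\mu A + \frac{1}{2}\nabla_\mu B\,\nabla^\mu B + \frac{1}{2}\bar\psi\, i\gamma^\mu\nabla_\mu\psi + \frac{1}{2}F^2 + \frac{1}{2}G^2 \\
& - iA[B, D] - i\bar\psi[\lambda, A] - i\bar\psi\gamma_5[\lambda, B] \Big)
\end{aligned} \tag{20.59}$$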


where the matter field is in the adjoint representation of the group. 
We also have the freedom of adding the most general renormalizable 
self-interaction of the $\Phi$ field, which is at most cubic: 


$$\int d^4x\, d^2\theta \left( \lambda_i\Phi_i + \frac{1}{2}m_{ij}\Phi_i\Phi_j + \frac{1}{3}g_{ijk}\Phi_i\Phi_j\Phi_k + \text{h.c.} \right) \tag{20.60}$$


where the terms in the interaction must be gauge invariant. 

At this point, we can make a few remarks about supersymmetry breaking. 
There are two known ways in which we can break supersymmetry spontaneously. 
First, we can simply add the gauge superfield $V$ (from which we constructed the 
supersymmetric gauge action) directly to the Lagrangian: 


$$\mathcal{L} \to \mathcal{L} + kV \tag{20.61}$$


The integration of this $V$ field, of course, generates a “D” term (Fayet–Iliopoulos 
term).¹⁸ Since the variation of this D term is a total derivative, we still have a 
supersymmetric theory. 

In general, this action creates a number of terms containing the D and F fields. 
Since they appear in this action as nonpropagating fields, we can eliminate them by 
their equations of motion. After this elimination, we are left with quartic terms in 
$A$, which generate a new effective potential. We then treat this effective potential 
in the usual way: We hunt for new vacua that allow us to break supersymmetry 
or gauge symmetry by shifting the vacuum, $\langle 0|A|0\rangle \neq 0$. 

The difficulty with this procedure, however, is that V in general transforms 
in the adjoint representation of the group, and hence cannot simply be added into 
the action. This means that we must have extra U(1) symmetries so that V is 
invariant and can be added freely in the action. However, this is often not desirable 
phenomenologically. 

Another more promising way to break supersymmetry is to add a term, called 
the “F” or O’Raifeartaigh term,¹⁹ to the action, which is also supersymmetric. 

Let us add a chiral term $W$, called the superpotential, into the action. As an 
example, consider the superpotential in Eq. (20.60). After performing the $d^2\theta$ 
integration, we are left with: 


$$\int d^4x\, \left[ F_i^* F_i + \left( \lambda_i F_i + m_{ij}\left(A_i F_j - \frac{1}{2}\psi_i\psi_j\right) + g_{ijk}\left(A_i A_j F_k - \psi_i\psi_j A_k\right) + \text{h.c.} \right) \right] \tag{20.62}$$

Now let us solve for the equations of motion for the auxiliary $F_i$ field: 

$$-F_i = \lambda_i^* + m_{ij}A_j^* + g_{ijk}A_j^* A_k^* \tag{20.63}$$


Now substitute this value for $F_i$ and $F_i^*$ back into the action. The superpotential 
has now changed into the term: 


$$-\frac{1}{2}m_{ik}\psi_i\psi_k - g_{ijk}\psi_i\psi_j A_k + \text{h.c.} - V(A_i, A_i^*) \tag{20.64}$$

where the potential $V(A_i, A_i^*)$ is given by: 


$$V = \sum_k F_k^* F_k \tag{20.65}$$


As before, the elimination of the auxiliary fields $F_i$ and $F_i^*$ has generated an 
effective potential $V(A_i, A_i^*)$ that can shift the vacuum. This, in turn, generates a 
fermion mass via the Yukawa term. 
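
The elimination of the auxiliary fields can be shown in one step. The sketch below assumes that the $F$-dependent part of the Lagrangian consists of $F_i^* F_i$ from the kinetic term plus the terms linear in $F_i$ from the superpotential, that $m_{ij}$ and $g_{ijk}$ are symmetric in their indices, and it abbreviates $W_i \equiv \lambda_i + m_{ij}A_j + g_{ijk}A_j A_k$ (a shorthand introduced here, not in the text).

```latex
\begin{align*}
\mathcal{L}_F &= F_i^* F_i + \left( F_i W_i + F_i^* W_i^* \right) , \\
\frac{\partial \mathcal{L}_F}{\partial F_i} = 0
  &\;\Longrightarrow\; F_i^* = -W_i ,
  \qquad
  \frac{\partial \mathcal{L}_F}{\partial F_i^*} = 0
  \;\Longrightarrow\; F_i = -W_i^* , \\
\mathcal{L}_F \Big|_{\text{on-shell}}
  &= W_i^* W_i - W_i^* W_i - W_i W_i^*
   = -\sum_i |W_i|^2 \equiv -V(A_i, A_i^*) .
\end{align*}
```

The potential is thus a sum of squares, $V = \sum_i |F_i|^2$, and is never negative.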

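This simple exercise can be generalized for an arbitrary superpotential $W(\Phi)$, 
where $\phi_i$ are the scalar fields within $W$. By repeating the same steps, we can show 
that the elimination of $F_i$ and $F_i^*$ from the action generates the following potential 
term: 


$$-\frac{1}{2}\left( \frac{\partial^2 W(\phi)}{\partial\phi_i\,\partial\phi_j}\,\psi_i\psi_j + \text{h.c.} \right) - \sum_i \left| \frac{\partial W(\phi)}{\partial\phi_i} \right|^2 \tag{20.66}$$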

Figure 20.1. In (a), the potential respects both supersymmetry and gauge symmetry. In (b), 
the potential breaks supersymmetry (because the vacuum state no longer has zero energy) 
but gauge symmetry is still unbroken. In (c), the potential preserves supersymmetry but 
breaks gauge invariance. In (d), the potential breaks both supersymmetry and gauge 
invariance. 


Pictorially, we can analyze how to break supersymmetry and gauge invariance 
with this potential. We recall from our previous discussion that supersymmetry 
is preserved if the vacuum has zero energy. This can also be generalized to 
show that if the vacuum has nonzero energy, then supersymmetry must be broken. 
From the previous expression, the potential V is positive definite. Thus, to have 
supersymmetry breaking, we need only show that some of the terms in V do not 
vanish, thereby giving the vacuum nonzero energy. 
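
Two one-field examples illustrate the criterion. They are illustrative choices of superpotential, not models discussed in the text, and use the potential $V = \sum_i |\partial W/\partial\phi_i|^2$ from Eq. (20.66) with real parameters $\lambda$ and $m \neq 0$.

```latex
\begin{align*}
\text{(i)}\quad & W = \lambda\phi + \tfrac{1}{2}m\phi^2 : &
   V &= |\lambda + m\phi|^2 , \quad V = 0 \ \text{at}\ \phi = -\lambda/m
   \quad \text{(supersymmetry unbroken)} \\
\text{(ii)}\quad & W = \lambda\phi,\ \lambda \neq 0 : &
   V &= |\lambda|^2 > 0 \ \text{for all}\ \phi
   \quad \text{(supersymmetry broken)}
\end{align*}
```

In case (i) the minimum sits at a shifted value of the scalar field but still has zero energy, so shifting the vacuum alone does not break supersymmetry.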

In Figure 20.1, we see various possibilities for the effective potential. In 
general, there are four possibilities. Potentials can be generated in which gauge 
symmetry and supersymmetry are broken together or independently of each other. 

When this mechanism for spontaneous symmetry breaking is applied to model 
building, one problem is that we cannot put the gauge fields and matter fields in the 
same multiplet since they transform differently under the isospin group; that is, the 
fermions belong to the fundamental representation, while the gauge fields belong 
to the adjoint representation. Therefore, we must introduce superpartners for both 
the gauge fields and matter fields. We must therefore introduce the supersymmetric 
partners of the familiar particles: bosonic “squarks” and “sleptons” transforming in 
the fundamental representation, fermionic “gauginos” transforming in the adjoint 
representation, as well as “Higgsinos” and “Goldstinos.” Another problem is that, 
since we do not see supersymmetry experimentally among the subatomic particles, 
we must be able to break supersymmetry at a sufficiently high mass scale so that 
these superpartners do not violate known experimental results. 

In addition, there are stringent mass relations that must be obeyed whenever
we use the “F” term to break supersymmetry. Using the form of the potential in Eq.
(20.66), a careful analysis of the most general superpotential shows that the
mass matrices $M_i$ for spin $i = 0, \tfrac{1}{2}, 1$ must obey the following relation:


\[ \mathrm{Tr}\left( M_0^2 - M_{1/2}^2 + 3M_1^2 \right) = 0 \tag{20.67} \]


This relation, unfortunately, is badly broken phenomenologically. It shows that the 
boson masses cannot be sufficiently heavier than the fermion masses as required 
in model building. Even if “D” terms are added to the action, they are unable to lift
this requirement. The only known way to avoid this mass condition is to add 
soft breaking terms to the action (which is undesirable) or move on to a more 
sophisticated theory, supergravity, which we will discuss shortly. 


20.4 Supersymmetric Feynman Rules 


There are several advantages to deriving the Feynman rules for supersymmetric
theories using the superspace formalism. First, the large number of component 
fields found within the superfield can be easily manipulated as a single block. 
Before superfields were introduced, calculations with the component fields were 
often long and tedious. Second, in the component formalism, the cancellations of 
certain divergent graphs are rather miraculous. In the superspace formalism, it is 
easy to see the origin of these cancellations. Finally, the usual rules of functional 
integration generalize naturally to the superspace formalism. 
To begin, we wish to find an expression for: 


\[ Z(J, \bar J) = \int D\phi\, D\bar\phi\, \exp\left( iS + i\int d^4x\, d^2\theta\, J\phi + i\int d^4x\, d^2\bar\theta\, \bar J\bar\phi \right) \tag{20.68} \]


where J is the source term for a chiral field. 

There are two types of Grassmann integrations found in the action, over
$d^2\theta$ and over $d^4\theta$. To derive the Feynman rules, we will find it convenient to
convert the chiral integral over $d^4x\, d^2\theta$ into an integral over $d^4x\, d^4\theta$. This is
accomplished by remembering that taking an integral over a Grassmann number
is the same as taking a derivative. We can show:


\[ \int d^2\theta = \int d\theta^2\, d\theta^1 = \tfrac{1}{2} D^2 \]
\[ \int d^2\bar\theta = \int d\bar\theta^{\dot 1}\, d\bar\theta^{\dot 2} = \tfrac{1}{2} \bar D^2 \tag{20.69} \]


which hold when applied directly onto chiral superfields.
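
The statement that Grassmann integration acts like differentiation can be checked on a toy algebra. The following is a minimal sketch, assuming only two anticommuting generators $\theta^1, \theta^2$ rather than the full superspace of the text; the dictionary representation and the names `derivative` and `berezin` are illustrative choices, not notation from the text.

```python
# Toy Grassmann algebra on generators theta^1, theta^2 (hypothetical helper code).
# A polynomial is a dict {monomial: coefficient}; a monomial is a tuple of
# strictly increasing generator labels, e.g. (1, 2) stands for theta^1 theta^2.

def _strip(f, i):
    """Move theta^i to the far left of each monomial containing it, then drop it."""
    out = {}
    for mono, coeff in f.items():
        if i not in mono:
            continue  # monomials without theta^i are annihilated
        pos = mono.index(i)
        sign = -1 if pos % 2 else 1  # one sign per anticommuting factor passed
        rest = mono[:pos] + mono[pos + 1:]
        out[rest] = out.get(rest, 0) + sign * coeff
    return out

def derivative(f, i):
    """Left derivative: d(theta^i)/d(theta^i) = 1 and d(1)/d(theta^i) = 0."""
    return _strip(f, i)

def berezin(f, i):
    """Berezin integral: int d(theta^i) theta^i = 1 and int d(theta^i) 1 = 0."""
    return _strip(f, i)

# f = 2 + 3 theta^1 + 5 theta^2 + 7 theta^1 theta^2
f = {(): 2, (1,): 3, (2,): 5, (1, 2): 7}

# int d(theta^2) d(theta^1) f: the innermost integral (over theta^1) acts first.
integral = berezin(berezin(f, 1), 2)
double_derivative = derivative(derivative(f, 1), 2)
print(integral, double_derivative)
```

Both operations are fixed by the same two defining rules, so they are literally the same linear map; the double integral simply projects out the coefficient of the highest component $\theta^1\theta^2$. Normalization factors such as the $\tfrac{1}{2}$ in the text depend on conventions that this toy model does not attempt to reproduce.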
Using the formulas in Eq. (20.40), we find: 


\[ \bar D^2 D^2 \phi = -16\, \partial^2 \phi \tag{20.70} \]
since $\bar D_{\dot\alpha} \phi = 0$. Then we write the chiral integral as:


\[ \int d^4x\, d^2\theta\, \phi = \int d^4x\, d^2\theta \left( -\frac{\bar D^2 D^2}{16\, \partial^2} \right) \phi \tag{20.71} \]
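The chiral integral over the mass term $m\phi^2$, for example, can be converted to
an eight-dimensional integral as follows:


\[
\begin{aligned}
\int d^4x\, d^2\theta\, \phi^2 &= \int d^4x\, d^2\theta\, \phi \left( -\frac{\bar D^2 D^2}{16\, \partial^2} \right) \phi \\
&= \int d^4x\, d^2\theta\, \tfrac{1}{2} \bar D^2 \left[ \phi \left( -\frac{D^2}{8\, \partial^2} \right) \phi \right] \\
&= \int d^4x\, d^4\theta\, \phi \left( -\frac{D^2}{8\, \partial^2} \right) \phi
\end{aligned}
\tag{20.72}
\]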


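Putting everything together, we have, for the free Wess-Zumino action with
an external chiral source J:


\[
\begin{aligned}
S_0 + \int d^4x\, d^2\theta\, J\phi + \int d^4x\, d^2\bar\theta\, \bar J\bar\phi
&= \int d^4x\, d^4\theta \left[ \bar\phi\phi - \tfrac{1}{2}\phi \left( -\frac{mD^2}{4\partial^2} \right) \phi - \tfrac{1}{2}\bar\phi \left( -\frac{m\bar D^2}{4\partial^2} \right) \bar\phi + J \left( -\frac{D^2}{8\partial^2} \right) \phi + \bar J \left( -\frac{\bar D^2}{8\partial^2} \right) \bar\phi \right] \\
&= \int d^4x\, d^4\theta \left( \tfrac{1}{2} \Phi^T A \Phi + \Phi^T B \right)
\end{aligned}
\tag{20.73}
\]
with $\Phi = (\phi, \bar\phi)^T$ and
\[
A = \begin{pmatrix} mD^2/4\partial^2 & 1 \\ 1 & m\bar D^2/4\partial^2 \end{pmatrix}; \qquad
B = \begin{pmatrix} -(D^2/8\partial^2) J \\ -(\bar D^2/8\partial^2) \bar J \end{pmatrix}
\]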


The functional integral can now be performed, leaving us with: 


ZOD) 


exp (-i | da®z a*2'BT(c')A~"B)) 


i 


exp ( ; fl d®z d8z' [5/@Aoe, 2) I(2") 


+ J(z)Ay(z, 2')J(2') + sterol, 2')J¢e')}) (20.75) 


where $d^8z = d^4x\, d^4\theta$. The propagators are given by:


i mD? 
ko = -->5— 
: 4 p?(p? —m?) 
Be) (20.76) 
je — tt! 
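and where:
\[
A^{-1} = \begin{pmatrix}
-\dfrac{m\bar D^2}{4(\partial^2 + m^2)} & 1 + \dfrac{m^2 \bar D^2 D^2}{16\partial^2(\partial^2 + m^2)} \\[2ex]
1 + \dfrac{m^2 D^2 \bar D^2}{16\partial^2(\partial^2 + m^2)} & -\dfrac{mD^2}{4(\partial^2 + m^2)}
\end{pmatrix}
\tag{20.77}
\]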


From these equations, we can write down the Feynman rules for a superfield 
theory. These rules will become crucial when we discuss nonrenormalization 
theorems. 


20.5 Nonrenormalization Theorems 


One truly remarkable feature of supersymmetry is the nonrenormalization the- 
orem. This makes supersymmetric field theories a laboratory in which to test 
ideas about quantum field theory that have more sophisticated renormalization 
properties than ordinary ones. 
The reason why supersymmetric theories have better convergence properties
than ordinary field theories is that the fermion and boson loops of quantum field 
theory appear with opposite signs and hence cancel. For example, one can show 
that the quadratic divergences of simple supersymmetric theories cancel among 
themselves, leaving only logarithmic divergences. Furthermore, one can show 
that the mass and coupling constant corrections are actually finite to all orders in 
perturbation theory, and hence only the wave-function renormalization constant 
$Z_\phi$ appears.

One direct application of this is that the superpotential is not renormalized by 
higher-loop corrections, and hence any fine tuning of the potential will not receive 
any contributions from renormalization. There is no necessity of retuning the 
parameters at each order in perturbation theory. This gives us a potential solution 
to the hierarchy problem in GUT theory. 

The proof that these cancellations persist to all orders in perturbation theory 
is prohibitive in the component formalism, where a series of miracles occur that 
cancel the divergent corrections to the mass and coupling constant. However, 
the proof to all orders in perturbation theory can be easily performed using the 
superspace method. 

We recall that the propagators in the superspace formalism can be given by: 


A; 1 x 4). Ea 1 ! 
(O11) 9(2)) = pam? 0-8 (VQ)VQ)) = —58'@ — 6) 

= 1 SD: p ' 
(P(1)G(2)) = Api (20.78) 


The vertices can be read off the Lagrangian, with the additional insertion of a
factor of $-(1/4)\bar D^2$ [or $-(1/4)D^2$] acting on the propagator for each $\phi$ (or $\bar\phi$) line
that leaves the vertex. There is also a factor of $\int d^4\theta$ at each vertex. Since we
are integrating over a series of delta functions, we find that all $\theta$ integrations can
be performed exactly. What is interesting is that, by simply counting Grassmann
variables, we can show that $\int d^2\theta$ or $\int d^2\bar\theta$ cannot be the end product of all the $\theta$
integrations. Only $\int d^4\theta\, f(\theta, \bar\theta)$ survives the integration process.

However, we know that the masses and coupling constants all appear in the
action via $m \int d^2\theta\, \phi^2$ or $g \int d^2\theta\, \phi^3$. Thus, corrections to these terms are finite.
There is only the wave-function renormalization constant $Z_\phi$, which contributes
to the mass and coupling constant renormalization via:


\[ m \rightarrow Z_\phi^{-1} m; \qquad g \rightarrow Z_\phi^{-3/2} g \tag{20.79} \]


Let us now analyze the terms that can be renormalized in gauge theory, that is,
have the form $\int d^4\theta\, f(\theta, \bar\theta)$. The degree of divergence of any graph (excluding
the superpotential and terms that contain $\int d^2\theta$ or $\int d^2\bar\theta$) can be given as:

\[ D = 2 - E - I \tag{20.80} \]

where $E$ is the number of external chiral lines and $I$ is the number of massive
internal chiral propagators. From this, one can show that the possibly divergent
contributions are given by:


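\[ \int d^4\theta\, \bar\phi\phi; \quad \int d^4\theta\, \bar\phi V \phi; \quad \int d^4\theta\, V; \quad \int d^4\theta\, VV; \quad \int d^4\theta\, VVV \tag{20.81} \]
We have used the fact that the dimensions of the fields and operators are given as:


\[ [V] = 0; \quad [\phi] = 1; \quad [D_\alpha] = \tfrac{1}{2}; \quad \left[ \int d^4\theta \right] = 2 \tag{20.82} \]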


Thus, all contributions (except for $\int d^4\theta\, V$) are logarithmically divergent.
The nonrenormalization theorem states that we only have wave-function
renormalization and gauge coupling renormalization constants $Z_\phi$ and $Z_g$, and that they
are logarithmically divergent.

We should note that the quantity $\int d^4\theta\, V$, which is quadratically divergent, is
gauge invariant only for U(1). Thus, for non-Abelian theories, this term is absent.
(Also, if Tr Q = 0, i.e., the trace of the charges of the scalar particles is zero, then
this term also vanishes.)


20.6 Finite Field Theories 


One of the most remarkable properties of supersymmetry is that supersymmetric 
field theories can be finite to all orders in perturbation theory, which was once 
thought to be impossible. In some sense, these theories answer Dirac’s old
objections to quantum field theory, that renormalization theory was in some sense 
contrived and artificial. 

We will now construct the global SO(4) super Yang-Mills theory, which will 
turn out to be finite to all orders in perturbation theory, and then we will discuss 
the supergravity theory. 

The SO(4) Yang-Mills theory can be constructed by coupling the N = 1 super
Yang-Mills theory, with the multiplet containing spins $(1, \tfrac{1}{2})$, to three copies of
supersymmetric matter, containing the multiplet with spins $(\tfrac{1}{2}, 0)$. To construct
the theory, we start with the usual Yang-Mills multiplet $(A_\mu, \psi)$ and then add
three more fermion fields, which generalize $\psi$ into four fields $\psi_i$. We must also,
of course, add the corresponding scalar fields, which we choose to be self-dual
and anti-self-dual:


\[ A_{ij} = \tfrac{1}{2} \epsilon_{ijkl} A_{kl}; \qquad B_{ij} = -\tfrac{1}{2} \epsilon_{ijkl} B_{kl} \tag{20.83} \]
Then the SO(4) action becomes:
1 ih, eels 1 ” l 
__— = Tr (- fuk +51 Dui + Du Ai D Aij + Dy Bij DY Bi 


is dies i 
"2 ai vi [w;, Aij]+ a virslY;, Bj) + 39 Ai Bail {Aij, By] 


1 1 
wh 6g lis: Axi [Aij, Agi) + 64 Bui Byil (Bij, Bul) (20.84) 
which is invariant under: 


SA = GW —eWit eijuew 


SB = sys Wy — EjysWi — €ijuecysvi 


1 
ov; = Glo Fuy + iy” DylAij + 5 Bi; )é; 
ie 
fs ai lAij — VsBij, Aye + VsBjrlex (20.85) 


The first indication that this action possessed remarkable renormalization
properties came from the renormalization group, where it was noticed that the $\beta$ function
vanished at the first-, second-, and even third-loop level.

For the single-loop $\beta$ function, we have a slight modification of the result
found for Dirac fermions:


\[ \beta(g) = -\frac{g^3}{16\pi^2}\, \frac{C_2(G)}{6} \left[ 22 - 4\nu(M) - \nu(R) \right] \tag{20.86} \]


where $\nu(M)$ and $\nu(R)$ are the numbers of Majorana fermions and real scalar fields,
respectively. (There is one Majorana fermion in the gauge multiplet. Each chiral
superfield contributes one Majorana fermion and two real scalar fields.) This function vanishes for
N = 4 super Yang-Mills theory, since there are three chiral superfields; therefore,
$\nu(M) = 1 + 3 = 4$ and $\nu(R) = 2 \times 3 = 6$.
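
The arithmetic behind this cancellation can be tabulated for different numbers of chiral multiplets. The sketch below assumes that all fields sit in the adjoint representation (as they do in the N = 4 theory) and evaluates only the bracket of Eq. (20.86); the function names are hypothetical.

```python
def adjoint_content(n_chiral):
    """Field content of N = 1 super Yang-Mills plus n_chiral adjoint chiral superfields."""
    nu_majorana = 1 + n_chiral      # one gaugino plus one fermion per chiral superfield
    nu_real_scalars = 2 * n_chiral  # one complex scalar = two real scalars each
    return nu_majorana, nu_real_scalars

def one_loop_bracket(nu_majorana, nu_real_scalars):
    """The factor [22 - 4 nu(M) - nu(R)] appearing in Eq. (20.86)."""
    return 22 - 4 * nu_majorana - nu_real_scalars

# Bracket for n = 0, 1, 2, 3 chiral superfields (N = 1, N = 2, ..., N = 4 content).
brackets = {n: one_loop_bracket(*adjoint_content(n)) for n in range(4)}
print(brackets)
```

The bracket decreases by 6 for each added chiral superfield and reaches zero exactly at n = 3, the field content of the N = 4 theory.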

For N = 1 supersymmetry coupled to n chiral multiplets, the two-loop result
is:


: 5 
ee SC) aa’ es 
Pats 0) eGo) = NRE) 
For N = 4 super Yang-Mills theory, we have n = 3; therefore, $\beta$ also vanishes at
the two-loop level.

Since then, there have been several different proofs showing that the theory is 
actually finite to all orders in perturbation theory. We will only summarize some 
of the arguments. One proof of the theorem rests on the fact that the N = 4 theory 
has the symmetry of the N = 1 and N = 2 theories as subsets. This yields a large 
number of constraints among the various renormalization constants, eventually 
giving us Z = 1 for all of them. Let us begin, for the moment, with the N = 1
super Yang-Mills theory coupled to supersymmetric matter fields. Let us couple
enough N = 1 multiplets so that we have the same number of fields as the N = 4
super Yang-Mills theory. However, although we have enough fields to construct
the N = 4 theory, assume that the coupling constants are not correlated with each
other as in the N = 4 theory, so we only have N = 1 supersymmetry.

Our strategy will be to show that, as we gradually change the coupling constants
so we recover N = 2 and then N = 4 supersymmetry, the resulting renormalization
constants also change, until they all reduce to $Z_i = 1$ at the end. To construct the
action, we will introduce a vector multiplet V and a matter multiplet ¥° in the 
adjoint representation. Also, we introduce chiral matter multiplets ¢% as well as 
¢». Then the most general superpotential one can write down for these fields is 
given by: 


b"¢b, + bb? + bo Us — mb%b, + g¢° Vb, (20.88) 


At this point, there is no correlation between the various coupling constants. 
If we now turn on interactions, then we have the following possible renormal- 
ization constants: 


Vv? = ZyV*; gy Ze8V 
o — Bie sie 
(go — Zi(67; m—>m+dbm 
(UY > Zui? (20.89) 


We can also show (because of the nonrenormalization of e?” and because of the 
properties of chirally supersymmetric interactions): 


Ze2y 13> (Zaz 


1 . (20.90) 


(1+5m/m)Z Zi, 


which come from the nonrenormalization theorems for e*” and for chiral super- 
space integrals. 
Now we impose an additional constraint. We will change the coupling con-
stants so that the matter multiplets and the N = 1 super Yang-Mills theory form 
N = 2 multiplets. This means that the gaugino from V and the spinor from Y 
form a doublet; so we have a new constraint on the coupling constants. Since the 
original coupling constants were arbitrary, we have the freedom to choose N = 2 
symmetry, which gives us the additional restriction: 


Z,=Z': 


’; Zy=Zy (20.91) 


which in turn implies: 
ZoZ,=1; dm=0 (20.92) 


At this point, we have eliminated all but three renormalization constants, one for 
each of the three matter multiplets. However, we still have the freedom to change 
the remaining coupling constants so that the matter multiplets and the super Yang— 
Mills theory form N = 4 super multiplets. This further restriction on the coupling 
constants implies a symmetry between all the matter multiplets, so that: 


Zo= Zi, Zip (20.93) 
However, from Eq. (20.90), this also implies that:
\[ Z = 1 \tag{20.94} \]


for all renormalization constants in the theory. Thus, the N = 4 theory is finite to 
all orders in perturbation theory. 

By somewhat similar arguments, one can show that the N = 2 theory is finite 
for all higher loop levels beyond the one-loop level (where it diverges). If we 
relax N = 4 supersymmetry but keep N = 2 symmetry, then we lose the condition 
that Zy = Z;, and hence lose finiteness. Hence an ordinary N = 2 theory is not 
finite. However, we still may be able to salvage this proof if we can find another 
way in which to make Z¢ = Z}. 

The way to patch up the proof of finiteness for N = 2 theories is to use a 
modification of the N = 2 theory, with real representations, in which an SU(2)
symmetry emerges that rotates @, into ¢*. Because of this additional symmetry, 
we can now equate Zy with Z/,, and then the only divergences in this N = 2 
theory come from Zy, which is one-loop divergent. Thus, we have proved that 
the N = 2 theory, with real representations, is divergent only at the one-loop level
and finite at all higher orders. 

However, we can modify the theory still further by adding more multiplets
to eliminate the one-loop divergence. Once the one-loop divergence is eliminated,
the modified N = 2 theory is finite to all orders.

In summary, the N = 4 super Yang-Mills theory and certain modified N = 2 super
Yang-Mills theories are finite to all orders. These are remarkable results that
are totally unexpected. However, this property does not persist when we build a 
supersymmetric theory of gravity. Although supergravity is much better behaved 
than ordinary Einstein gravity, it is divergent at the three-loop level. 


20.7 Super Groups 


Now that we have accumulated some practice in constructing supersymmetric 
theories, let us now analyze more systematically the group theoretic structure of 
super groups. We know from the classical works of Lie and Cartan that a complete 
classification of compact Lie groups is possible. Kac has generalized this result 
and given us a complete classification of the super groups. 

Although there are many possible super groups, the only ones in which we 
are interested are the ones that generalize the standard space-time groups found 
in physics, that is, the Poincaré group and (for massless theories) the conformal 
group, SU(2, 2) = O(4, 2). Each group, in turn, is part of an infinite series of super 
groups, which are called the orthosymplectic Osp(N/M) and the superconformal 
SU(N/M) groups, respectively.

To see how these super groups are constructed, we recall that the orthogonal
group O(N) is the set of all real orthogonal N x N matrices that leave the following
form invariant:


\[ O(N): \quad x_i x_i = \text{invariant} \tag{20.95} \]


Likewise, the symplectic group is the set of N x N matrices that leave the
following form invariant:


\[ Sp(N): \quad \theta_m C_{mn} \theta_n = \text{invariant} \tag{20.96} \]

where $C_{mn}$ is a real, antisymmetric matrix and the $\theta_n$ are anticommuting numbers.

The orthosymplectic group Osp(N/M) is the set of matrices that leave the
following form invariant:


\[ Osp(N/M): \quad x_i x_i + \theta_m C_{mn} \theta_n = \text{invariant} \tag{20.97} \]


for i = 1, 2, ..., N and n = 1, 2, ..., M. Not surprisingly, the algebra of the
orthosymplectic group can be decomposed into blocks that contain the matrices
of Sp(M) and O(N):


\[ Osp(N/M) = \begin{pmatrix} O(N) & A \\ B & Sp(M) \end{pmatrix} \tag{20.98} \]


where A and B are determined by the commutation and anticommutation
relations of the Jacobi identity. O(N) and Sp(M) are therefore subgroups of the
orthosymplectic group:


\[ O(N) \otimes Sp(M) \subset Osp(N/M) \tag{20.99} \]


More concretely, we are interested in the group Osp(1/4), which is the gauge
group of supergravity. It contains the symplectic group, Sp(4), which is isomorphic
to the de Sitter group, and contains the same number of generators as the
Poincaré group, $P_\mu$ and $M_{\mu\nu}$. Its commutation relations differ only slightly from
those of the Poincaré group; that is, the commutator of two translations $[P_\mu, P_\nu]$
is proportional to the Lorentz generator divided by the square of a length, which is
called the de Sitter radius. In the limit of infinite de Sitter radius, two translations
commute, and hence we have the same commutation relations as the Poincaré
group (see Exercise 14.7).

We also point out that the second physically interesting super group is the 
superconformal group. The group SU(N), of course, is the set of unitary N x N 
matrices with unit determinant. They leave the following form invariant: 


\[ SU(N): \quad (u^i)^* u^j \delta_{ij} \tag{20.100} \]


Not surprisingly, the superconformal group SU(N/M) is the group with elements
that leave the following form invariant:


\[ (u^i)^* u^j \delta_{ij} + (\theta^m)^* g_{mn} \theta^n \tag{20.101} \]


where i = 1, 2, ..., N, m = 1, 2, ..., M, and where $g_{mn} = \pm\delta_{mn}$.
SU(N/M) can naturally be decomposed into the following form:


\[ SU(N) \otimes SU(M) \otimes U(1) \subset SU(N/M) \tag{20.102} \]


Let us now be concrete about the generators of Osp(N/4). We know that this
orthosymplectic group must contain the generators of O(N), which we call $T_i$, as
well as the generators of the symplectic group Sp(4), which can be represented
by the usual Poincaré generators $P_\mu$ and $M_{\mu\nu}$. We also have the supersymmetric
generator $Q_{\alpha i}$, which now carries the O(N) index i, as well as the two-spinor
index $\alpha$. Then the generators of the super group are:


ear | ich Te: {On QO, ;} = 26;;(0") 5 Px 
{On On} = 2enZiys 10a, Os, =—2en2y 


[Qai,Muv] = 5 (wie Oni [Dats Mysl = 5 Oii(Gw)h 
[Qai, Tj] = (bj) Qax; (Qai, 7] = —;); an 
if. Pl = 0 el=0 
[Z;;, anything] = 0 (20.103) 


where é, are the structure constants for O(N), where the Z;; are certain linear 
combinations of the generators 7;: 


Zij =a i (20.104) 


for the constant matrices ai., and the matrices bf are Hermitian. 

One can show that SO(4) and SO(8) are the largest
possible groups for super Yang-Mills theory and for supergravity, respectively.
The proof of this important fact is rather simple. We know that the supersymmetric
generator for SO(N) supergravity is given by $Q^i_\alpha$, where $\alpha$ is a spinor index and
i is an SO(N) index, where i = 1, ..., N. We also know from group theoretical
arguments that the spectrum of supergravity states can be generated by hitting
the lowest helicity state $|-\rangle$ successively with the $Q^i_\alpha$ operator. For the super
Yang-Mills theory, the field with the lowest helicity is the spin-1 vector particle,
while for supergravity it is the spin-2 graviton.

If we act with this operator once, we have: 


\[ Q^i_\alpha |-\rangle \rightarrow \psi^i \tag{20.105} \]


In the super Yang-Mills theory, this means that the partner of the Yang-Mills field
is a spin-1/2 field with isospin index i. In the supergravity theory, it means that the
superpartner of the graviton is a spin-3/2 gravitino that also has an isospin index
i.

Similarly, we can hit the lowest helicity state with two supersymmetric generators:


\[ Q^i_\alpha Q^j_\beta |-\rangle \tag{20.106} \]


For the super Yang-Mills theory, this means that the spin-0 particle has two indices
i, j that are antisymmetric. Thus, there are N(N - 1)/2 such scalar particles. For
the supergravity theory, this state corresponds to a vector particle that transforms
as $A_\mu^{ij}$, where i, j are antisymmetric.

This procedure can obviously be repeated an arbitrary number of times, each
time generating spinning particles with isospin indices that are antisymmetric in
i, j, k, .... However, there is an important restriction. For the super Yang-Mills
theory, we want the highest spin in the theory to be the Yang-Mills field. Thus,
we can only hit the lowest helicity state $|-\rangle$ four times with $Q_{\alpha i}$ until we arrive
at $|+\rangle$, which is the highest helicity state corresponding to the Yang-Mills field.
Therefore, the maximum orthogonal group that we can accommodate without going
to higher spins is SO(4), because there are four half-steps in spin between the
lowest and highest helicity state of the vector particle.

Counting antisymmetric indices, it is easy to see that the helicities and number 
of states for the N = 4 multiplet are given by: 


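\[
\begin{array}{l|ccccc}
\text{Helicity:} & -1 & -\tfrac{1}{2} & 0 & \tfrac{1}{2} & 1 \\
\text{States:} & 1 & 4 & 6 & 4 & 1
\end{array}
\tag{20.107}
\]


There are 1 + 6 + 1 = 8 bosonic states and 4 + 4 = 8 fermionic states, so we have equal
numbers of bosons and fermions, as expected in any supersymmetric theory.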

Similarly, we can only hit the graviton state with the lowest helicity $|-\rangle$ eight
times with $Q_{\alpha i}$ until we arrive at $|+\rangle$, the highest helicity state that also corresponds
to the graviton. We must stop at this point, because an interacting massless spin-3
theory is known to be inconsistent. Thus, this procedure must not generate spins
beyond 2, or else we lose self-consistency. Since there are 8 half-steps between
$|-\rangle$ and $|+\rangle$, the maximum number of generators $Q^i_\alpha$ must be N = 8. Thus, the
highest symmetry group must be SO(8). This is rather unfortunate, because this
group is too small to accommodate the Standard Model.

If we count the helicity states as before, then the N = 8 multiplet has the 
following number of states: 


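\[
\begin{array}{l|ccccccccc}
\text{Helicity:} & -2 & -\tfrac{3}{2} & -1 & -\tfrac{1}{2} & 0 & \tfrac{1}{2} & 1 & \tfrac{3}{2} & 2 \\
\text{States:} & 1 & 8 & 28 & 56 & 70 & 56 & 28 & 8 & 1
\end{array}
\tag{20.108}
\]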


Counting helicity states as before, we find that the number of independent components
of a rank-p tensor that is antisymmetric in its indices i, j, k, ... is equal to:


\[ \binom{N}{p} \tag{20.109} \]
The total number of fields in the SO(8) theory is therefore given by 1 + 8 + 28 +
56 + 70 + 56 + 28 + 8 + 1 = 256. The number of bosonic states is given by 1 + 28
+ 70 + 28 + 1 = 128. Likewise, the total number of fermionic states is given by
8 + 56 + 56 + 8 = 128. So we have an equal number of boson and fermion fields
(on-shell).
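
The counting for both multiplets follows from binomial coefficients and can be reproduced with a short script. This is a minimal sketch, assuming the multiplet is built by applying k = 0, ..., N antisymmetrized supercharges to the lowest helicity state, each raising the helicity by 1/2; the function names are illustrative.

```python
from fractions import Fraction
from math import comb

def massless_multiplet(n_susy, lowest_helicity):
    """List (helicity, number of states) after acting k = 0..N times with Q."""
    return [(Fraction(lowest_helicity) + Fraction(k, 2), comb(n_susy, k))
            for k in range(n_susy + 1)]

def boson_fermion_count(multiplet):
    """Integer helicities are bosons; half-integer helicities are fermions."""
    bosons = sum(n for h, n in multiplet if h.denominator == 1)
    fermions = sum(n for h, n in multiplet if h.denominator == 2)
    return bosons, fermions

n4 = massless_multiplet(4, -1)  # super Yang-Mills: helicity -1 ... +1
n8 = massless_multiplet(8, -2)  # supergravity:     helicity -2 ... +2

print([n for _, n in n4], boson_fermion_count(n4))
print([n for _, n in n8], boson_fermion_count(n8))
```

The highest helicity reached is the lowest helicity plus N/2, which is why N = 4 is the largest value compatible with spin 1 and N = 8 the largest compatible with spin 2.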


20.8 Supergravity 


Up to now, we have only considered global supersymmetry. However, the real 
beauty of this approach emerges when we consider gauging the supersymmetric 
group to produce a gauge theory of a new type. In this way, we will see that 
supergravity necessarily emerges when we gauge the super group. 

There are many ways in which to formulate supergravity. The approach we 
will take will mimic the Yang-Mills approach as much as possible. We begin 
with the super group Osp(1/4), which has 14 generators. In addition to the four
supersymmetric operators $Q_\alpha$, there are also the 10 generators of Sp(4), which
can be arranged as in the Poincaré group, consisting of the Lorentz generators $M_{ab}$
and the translations $P_a$. As in Yang-Mills theory, we will introduce a separate
connection field for each of the generators of Osp(1/4).
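
The generator count can be checked from the block structure of Eq. (20.98). The sketch below assumes the standard dimensions N(N - 1)/2 for O(N) and M(M + 1)/2 for Sp(M) (with Sp(M) acting on an M-dimensional space, so that Sp(4) has 10 generators as stated above), plus N x M fermionic generators from the off-diagonal blocks; the function name is hypothetical.

```python
def osp_generator_count(n, m):
    """Return (O(N) generators, Sp(M) generators, fermionic generators) of Osp(N/M)."""
    orthogonal = n * (n - 1) // 2  # antisymmetric N x N block
    symplectic = m * (m + 1) // 2  # Sp(M) block acting on M anticommuting coordinates
    fermionic = n * m              # off-diagonal blocks, related to each other
    return orthogonal, symplectic, fermionic

o_part, sp_part, odd_part = osp_generator_count(1, 4)
total = o_part + sp_part + odd_part
print(o_part, sp_part, odd_part, total)
```

For Osp(1/4) this gives 0 + 10 + 4 = 14, matching the ten generators $P_a$, $M_{ab}$ and the four supersymmetry generators $Q_\alpha$.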

Let $M_A$ collectively refer to all the generators of Osp(1/4). They satisfy:


\[ [M_A, M_B\} = f_{AB}{}^{C} M_C \tag{20.110} \]
where $f_{AB}{}^{C}$ are the structure constants of the group, and we have both commutators
and anticommutators in the algebra.
Let $\omega_\mu^A$ collectively refer to all the connection fields. The fields transform as:


bas = face? or, (20.111) 
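where:
\[ \omega_\mu^A = \{ e_\mu^a,\ \omega_\mu^{ab},\ \bar\psi_\mu^\alpha \}; \qquad M_A = \{ P_a,\ -iM_{ab},\ Q_\alpha \} \tag{20.112} \]


The covariant derivative is therefore:


\[ \nabla_\mu = \partial_\mu + e_\mu^a P_a - i\omega_\mu^{ab} M_{ab} + \bar\psi_\mu^\alpha Q_\alpha \tag{20.113} \]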
[Notice that gauging the translations $P_a$ necessarily introduces a connection field
$e_\mu^a$, which is the vierbein field. Thus, gauging the super group Osp(1/4) necessarily
introduces the graviton field. There is no other choice. Local supersymmetry
necessarily creates a theory of quantum gravity.]
The commutator of two covariant derivatives yields the curvature tensor:


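\[ [\nabla_\mu, \nabla_\nu] = R_{\mu\nu}^A M_A \tag{20.114} \]
where:
\[ R_{\mu\nu}^A = \partial_\mu \omega_\nu^A - \partial_\nu \omega_\mu^A + \omega_\mu^B \omega_\nu^C f_{BC}{}^{A} \tag{20.115} \]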


In component form, this reduces to: 


RAP) = des + aire? — (uo v) 
Ra?(M) = d@ 0) + a? sat eget) 
Ri(Q) = a+ polo” +hely* —(uo v) 
(20.116) 
The variation of the curvature is equal to: 
dRi, = Rie See (20.117) 


The action for supergravity is now given by contracting the curvature tensors 
via the antisymmetric invariant tensors: 


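\[ \mathcal{L} = \epsilon^{\mu\nu\rho\sigma} \left[ R_{\mu\nu}(M)^{ab} R_{\rho\sigma}(M)^{cd} \epsilon_{abcd} + R_{\mu\nu}(Q)^{\alpha} R_{\rho\sigma}(Q)^{\beta} (\gamma_5 C)_{\alpha\beta} \right] \tag{20.118} \]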


where C is the charge conjugation matrix. If we make a variation of the action, 
we find that the action is not invariant unless we enforce the condition: 


\[ R_{\mu\nu}^a(P) = 0 \tag{20.119} \]


The action, at first, appears to be an $R^2$-type action, which is not unitary
(because of ghosts). However, this is an illusion. The $R^2$ term is actually a total
derivative and topologically invariant. Hence, it can be dropped from the action.
The cross terms are linear in R and give us the supergravity action:


\[ \mathcal{L} = -\frac{1}{2\kappa^2}\, e R - \frac{1}{2}\, \bar\psi_\mu \gamma_5 \gamma_\nu D_\rho \psi_\sigma\, \epsilon^{\mu\nu\rho\sigma} \tag{20.120} \]
(There is also a term proportional to $e$, which comes from a generalization of the
cosmological constant term divided by the de Sitter radius to the fourth power.
We can drop this cosmological term if we set the de Sitter radius to infinity.)

In a similar fashion, we can also introduce the curvatures for the group
SU(2, 2/1), the superconformal group. By contracting products of these curvatures,
one can write down a higher derivative theory that is locally superconformally
symmetric, called conformal supergravity.

Although the N = 1 supergravity action was relatively easy to construct, there
are severe complications when generalizing this to N = 8 supersymmetry. To
construct the SO(8) action in the component formalism is prohibitively difficult.
Even the superfield method is prohibitive in this case. Instead, we will construct
the SO(8) action by using a trick. We will formulate the N = 1 supergravity
theory in 11 dimensions. Since the symmetry group for this higher-dimensional
theory is only N = 1 supersymmetry, the action is easier to write down. Then
we will use dimensional reduction or compactification to yield an N = 8 action in
four dimensions.

To see that the number of states formally is the same, let us count the number 
of states within an 11-dimensional supergravity. The counting of states proceeds 
as follows: 


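\[
\begin{aligned}
e_M^A &: \quad \tfrac{1}{2} \cdot 9 \cdot 10 - 1 = 44 \\
\psi_M &: \quad \tfrac{1}{2} (9 \times 32 - 32) = 128 \\
A_{MNP} &: \quad \binom{9}{3} = 84
\end{aligned}
\tag{20.121}
\]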


where M, N represent 11-dimensional curved space indices, A, B represent flat
space 11-dimensional indices, and the spinors are 32 dimensional. $e_M^A$ represents
the 11-dimensional vierbein linking the base manifold with the tangent space, $\psi_M$
is the gravitino field, and $A_{MNP}$ is an antisymmetric tensor field. The total number
of boson fields is 44 + 84 = 128, which equals the number of fermion fields, as it should.
The total number of boson and fermion fields is thus 256, which is precisely the
number of fields in the four-dimensional N = 8 model. Thus, not only do we have
equal numbers of fermions and bosons, we also have the same number that appear
in the four-dimensional N = 8 action.
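
The on-shell counting in Eq. (20.121) can be reproduced by treating each field as a representation of the transverse group SO(d - 2). The sketch below assumes d = 11 and a 32-component spinor, as in the text; the function names are illustrative, and the gravitino count simply encodes the formula quoted in Eq. (20.121).

```python
from math import comb

def graviton_states(d):
    """Symmetric traceless tensor of the transverse group SO(d - 2)."""
    n = d - 2
    return n * (n + 1) // 2 - 1

def three_form_states(d):
    """Rank-3 antisymmetric tensor of SO(d - 2)."""
    return comb(d - 2, 3)

def gravitino_states(d, spinor_dim):
    """Transverse vector-spinor minus its gamma-trace, halved on-shell."""
    n = d - 2
    return (n * spinor_dim - spinor_dim) // 2

bosons = graviton_states(11) + three_form_states(11)
fermions = gravitino_states(11, 32)
print(graviton_states(11), three_form_states(11), bosons, fermions)
```

The totals, 128 bosonic and 128 fermionic states, agree with the sums of binomial coefficients found for the N = 8 multiplet in four dimensions.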

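With some work, one can show that the action for 11-dimensional supergravity
is:


\[
\begin{aligned}
\mathcal{L} = {} & -\frac{1}{2\kappa^2}\, e R - \frac{1}{2}\, e\, \bar\psi_M \Gamma^{MNP} D_N\!\left[ \tfrac{1}{2}(\omega + \hat\omega) \right] \psi_P - \frac{1}{48}\, e F_{MNPQ}^2 \\
& - \frac{\sqrt{2}\,\kappa}{384}\, e \left( \bar\psi_M \Gamma^{MNPQRS} \psi_S + 12\, \bar\psi^N \Gamma^{PQ} \psi^R \right) (F + \hat F)_{NPQR} \\
& - \frac{\sqrt{2}\,\kappa}{3456}\, \epsilon^{M_1 \cdots M_{11}} F_{M_1 \cdots M_4} F_{M_5 \cdots M_8} A_{M_9 M_{10} M_{11}}
\end{aligned}
\tag{20.122}
\]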


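which is invariant under:


\[
\begin{aligned}
\delta e_M^A &= \frac{\kappa}{2}\, \bar\eta\, \Gamma^A \psi_M \\
\delta A_{MNP} &= -\frac{\sqrt{2}}{8}\, \bar\eta\, \Gamma_{[MN} \psi_{P]} \\
\delta \psi_M &= \kappa^{-1} D_M(\hat\omega)\, \eta + \frac{\sqrt{2}}{288} \left( \Gamma_M{}^{PQRS} - 8\, \delta_M^P \Gamma^{QRS} \right) \eta\, \hat F_{PQRS}
\end{aligned}
\tag{20.123}
\]
and where:
\[ \hat\omega_{MAB} = \omega_{MAB} + \frac{1}{8}\, \bar\psi^P \Gamma_{PMABQ} \psi^Q \tag{20.124} \]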


As mentioned earlier, the SO(8) group is too small to include the Standard
Model. If we go to higher supergravity theories beyond SO(8), then we have
interacting massless spin-3 fields, which are known to be inconsistent. This is
disappointing.

One alternative is to couple supergravity to supersymmetric Yang-Mills fields 
with the gauge group given by the Standard Model. The addition of supergravity 
gives us nontrivial corrections to the effective potential in Eq. (20.66), which 
allow us to relax the stringent condition in Eq. (20.67). Supergravity coupled 
to super Yang—Mills theory thus has interesting phenomenology. However, this 
coupled theory diverges at the one-loop level, making it less convergent than 
supergravity (which diverges at the third-loop level) or even ordinary quantum 
gravity (which diverges at the second loop level). Thus, supergravity coupled to 
a super Standard Model has phenomenologically good properties, except that it is 
highly divergent. 

In summary, we have seen that supersymmetric theories give us a theoretical 
laboratory to study field theories with radically different properties. 

First, they mix isospin and space-time symmetries in a nontrivial way, thereby 
evading the Coleman—Mandula theorem. This gives us the hope of eventually 
putting all subatomic particles in the same irreducible representation. 

Second, they can cancel enough divergences to render the N = 4 and certain
N = 2 super Yang-Mills theories finite to all orders in perturbation theory. This
realizes Dirac’s original dream of field theories that do not require renormalizations.

Third, they can, in principle, solve the hierarchy problem. Supersymmetric
theories have powerful nonrenormalization theorems that prove that the mass 
separation between GUT scale and low-energy physics is not renormalized. 

Fourth, their local version necessarily contains gravity. Supersymmetric Ward 
identities can remove the first- and second-loop divergences, although they fail 
at the third loop level. Unfortunately, the maximum supergravity has SO(8) 
symmetry, which is too small to include the Standard Model. In addition, the 
theory is not renormalizable. 

Faced with the divergence of supergravity, one is forced to enlarge the gauge 
group, hoping to generate enough Ward identities that can cancel all possible 
counterterms. The only known nontrivial generalization of supergravity is the 
superstring theory, to which we turn in the next chapter. 


20.9 Exercises 


1. In quantum field theory, we must throw away, by hand, the infinite zero point 
energy of the scalar and fermionic fields. Show that in a simple supersymmet- 
ric theory, the zero-point energies of the bosonic and fermionic fields cancel 
by themselves. 


2. Perform the integration over the Grassmann variables in the Wess—Zumino 
action in Eq. (20.46). Show that its free part is equivalent to the action written 
in terms of components in Eq. (20.13). 


3. Perform the integration over the action (20.50) in the Wess—Zumino gauge and 
show that we recover the supersymmetric Yang—Mills theory in Eq. (20.15). 


4. By direct computation, prove that the supersymmetric Yang-Mills theory in 
Eq. (20.15) is invariant under a supersymmetric transformation. 


5. Prove (only to lowest order) that the SO(4) Yang-Mills action in Eq. (20.84)
is invariant under Eq. (20.85). 


6. Consider the constraint $R_{\mu\nu}^a(P) = 0$ in Eq. (20.119). Show that this constraint
is equivalent to the vanishing of the covariant derivative of the vierbein. 


7. Prove all the relations in Eq. (20.40). 


8. Write down an expression for the Noether (super) current for the Wess-Zumino
model.


9. In supergravity, show that the anticommutator of two supersymmetry variations
of the gravitino does not close properly, showing that the theory does not
really form a group structure in the usual sense. (Show the presence of a few
terms that do not close, not the whole expression.) Show that the noninvariant
terms are, in fact, proportional to the equations of motion, so that they vanish
on-shell.


10. $g_{\mu\nu}$ has 10 degrees of freedom, and $\psi_\mu$ has 16 degrees of freedom, yet we
know that, in the canonical formalism of supergravity, we are only left with
two helicities for both. Show how, using gauge invariance, we can remove
all degrees of freedom down to two helicities.


11. In supergravity, the basic fields are the vierbein $e^a_\mu$ and the gravitino $\psi_\mu$. Perform the
counting of states both on-shell and off-shell. Show that off-shell, there is a
mismatch of 6 fields, requiring auxiliary fields.


12. To compensate for these 6 missing boson fields, let us add nonpropagating
fields $A_\mu$, S, and P to the supergravity action:

$$L = -\frac{1}{2\kappa^2}eR - \frac{1}{2}\epsilon^{\mu\nu\rho\sigma}\bar{\psi}_\mu\gamma_5\gamma_\nu D_\rho\psi_\sigma - \frac{e}{3}\left(S^2 + P^2 - A_\mu^2\right) \tag{20.125}$$

Show (to lowest order) that this new action is invariant under:

$$\delta e^a_\mu = \kappa\,\bar{\epsilon}\gamma^a\psi_\mu$$
$$\delta\psi_\mu = \left(D_\mu + \frac{i}{2}A_\mu\gamma_5\right)\epsilon - \frac{1}{2}\gamma_\mu\eta\,\epsilon$$
$$\delta S = \frac{1}{4}\bar{\epsilon}\,\gamma_\mu R^\mu$$
$$\delta P = -\frac{i}{4}\bar{\epsilon}\gamma_5\,\gamma_\mu R^\mu$$
$$\delta A_\mu = \frac{3i}{4}\bar{\epsilon}\gamma_5\left(R_\mu - \frac{1}{3}\gamma_\mu\,\gamma_\nu R^\nu\right)$$
$$\eta = -\frac{1}{3}\left(S - i\gamma_5 P - i\gamma^\mu A_\mu\gamma_5\right) \tag{20.126}$$

where:

$$R^\mu = \epsilon^{\mu\nu\rho\sigma}\gamma_5\gamma_\nu\left(D_\rho\psi_\sigma - \frac{i}{2}A_\sigma\gamma_5\psi_\rho + \frac{1}{2}\gamma_\sigma\eta\,\psi_\rho\right) \tag{20.127}$$


13. Show that the supergravity algebra with these auxiliary fields closes properly
off-shell, without having to invoke the equations of motion. Calculate the
closure of the algebra for the gravitino field. (Only calculate the lowest-order
terms. Show that the terms that previously destroyed the closure of the algebra
cancel.)


14. Show that, if we eliminate $\omega_\mu^{ab}$ in Eq. (20.120), we find that the connection
field picks up a contribution from the gravitino field. Show that this new term
is proportional to the torsion (i.e., $\Gamma^\lambda_{\mu\nu} - \Gamma^\lambda_{\nu\mu}$).


15. Write down the supersymmetric version of SU(5) GUT, introducing separate 
superfields for the various representations in the theory. Do not spontaneously 
break the theory. 


16. Using superfield methods, show that the simplest one-loop graph in the Wess-Zumino
model, after Grassmann integrations, is of the form $\int d^4\theta\, f(\theta, \bar{\theta})$.


Chapter 21 
Superstrings 


But the creative principle resides in mathematics. In a certain sense, 
therefore, I hold it true that pure thought can grasp reality, as the ancients
dreamed.
— A. Einstein 


21.1 Why Strings? 


At present, there is only one finite theory of quantum gravity, and this is the
superstring theory. In this sense, the theory has no rivals. In addition, the theory
can apparently reproduce all the known particle interactions found in nature. The
fact that one can, in principle, construct solutions that include both gravity and
all known interactions from such a simple physical picture, the string, is rather
remarkable.

The desirable properties of string theory, as usual, derive from its powerful
gauge groups. The gauge groups of the superstring include:


1. Conformal and superconformal invariance. These are the symmetries defined 
on the two-dimensional surfaces swept out by the string. 


2. General coordinate transformation. Being a theory of quantum gravity, it 
possesses space-time reparametrization invariance. 


3. $E_8 \otimes E_8$. This gauge group emerges when we compactify some of the higher
dimensions of the theory.


4. Space-time supersymmetry. This symmetry helps to solve the hierarchy
problem and cancel some of the potential infinities of the theory.


The symmetries found in particle physics and general relativity therefore
emerge as a tiny subset of the symmetries of the superstring. In addition to being a
finite theory of gravity, the theory also has definite phenomenological advantages
over other theories:


1. The group $E_8 \otimes E_8$ is large enough to contain GUT theory. Supergravity,
by contrast, was limited by its isospin group SO(8), which was too small for
phenomenology.


2. The superstring can accommodate complex fermion representations like those
found in the Standard Model because it is not based on Riemannian manifolds.
Kaluza-Klein theory, being Riemannian, cannot accommodate these fermion
representations.


3. The superstring theory is completely free of anomalies. Gravity theory in 
higher dimensions, by contrast, has problems with anomalies once we have 
chiral fermions. 


4. The superstring theory gives a plausible explanation of the generation problem 
in the Standard Model in terms of certain topological invariants that exist on 
six-dimensional manifolds. GUT theory, by contrast, cannot explain the 
presence of fermion generations. 


5. The superstring theory has no hierarchy problem because of powerful non- 
renormalization theorems. The GUT theory, however, has a hierarchy prob- 
lem. 


We caution, however, that as with all theories defined at the Planck energy $10^{19}$
GeV, like quantum gravity, the superstring theory is subject to the severe criticism 
that it cannot be tested with present technology. Predictions of the theory, for 
example, that space-time was actually ten dimensional at the instant of the Big 
Bang, are beyond experimental verification. Unlike GUT theory, which yields 
testable predictions in the form of proton decay, it is difficult to find an experiment 
that can rule out (or in) superstring theory in the coming years. 

Our philosophy, as we said before, is to treat superstring theory as a theoretical 
tool, as an example of a field theory which has highly nontrivial features that can 
probe the limits of quantum field theory. Underlying the superstring theory is a 
genuine quantum field theory; from its Lagrangian, we can derive the standard 
quantization rules and find the spectrum of states and the Feynman-like rules. This 
quantum field theory of strings apparently satisfies all the nontrivial constraints 
postulated for the S matrix. This quantum field theory also contains perhaps the 
most sophisticated, self-consistent Lagrangian to appear in physics, and hence 
deserves serious study. If we take particular subsets of this action, then we can 
find the usual actions describing quantum gravity, supergravity, gauge theory, and 
GUTs.

With a few rather mild assumptions, one can also find classical solutions to
the string equations that come surprisingly close to the Standard Model, including
three generations of fermions with a stable hierarchy. However, the theory also has 
an additional problem: It has millions of other solutions. It is not known how to 
select the true vacuum among the millions that have been found. A nonperturbative 
analysis is probably required to find the true vacuum of the theory, which is beyond 
our current calculational ability. The main problem facing superstring theory is 
thus theoretical, rather than experimental. If the true vacuum of the theory could 
be found theoretically, it should be possible to make a direct comparison with 
experiment. At that point, one can decide whether or not it correctly describes 
all quantum forces. But until the true vacuum is found, the theory does not have 
true predictive power. Until then, the superstring theory will remain a highly 
sophisticated quantum field theory without direct physical application. 

Because of the mathematical complexity of string theory, we will only sketch 
some highlights of the theory in this chapter. The reader is referred to the literature 
for a more detailed explanation of the theory. 


21.2 Points versus Strings 


String theory, at first appearance, seems strange because it was historically for- 
mulated as a first quantized theory, rather than as a second quantized field theory. 
Therefore, it will be instructive to examine the simplest first quantized system, the 
relativistic point particle, and later develop the second quantized theory of points 
and strings. 

Let the coordinate $x^\mu(\tau)$ represent a vector that points from the origin of our
system to the location of a point particle. As the particle moves, it sweeps out a
line, called the world-line, parametrized by $\tau$. The action is proportional to the
invariant length swept out by the world-line:

$$S = -m\int d\tau\,\sqrt{\dot{x}_\mu^2} \sim \text{length} \tag{21.1}$$

This action is invariant under reparametrizations of $\tau$:

$$\tau \to \tilde{\tau}(\tau) \tag{21.2}$$
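The reparametrization invariance of the world-line action can be checked numerically. The following is a minimal sketch, assuming a hypothetical timelike path in 1+1 dimensions and the metric signature (+, -); the function names `worldline_length`, `path`, and `reparametrized` are illustrative and not part of the text:

```python
import math

def worldline_length(x, n=20000):
    """Polygonal estimate of the invariant length for tau in [0, 1], metric (+, -)."""
    total = 0.0
    for k in range(n):
        t0, x0 = x(k / n)
        t1, x1 = x((k + 1) / n)
        ds2 = (t1 - t0) ** 2 - (x1 - x0) ** 2  # timelike interval, positive here
        total += math.sqrt(ds2)
    return total

def path(tau):
    # Hypothetical timelike path: (t, x) = (tau, 0.3 sin(2 tau)), speed below 1
    return tau, 0.3 * math.sin(2.0 * tau)

def reparametrized(tau):
    # Same curve traced with tau -> tau**2, a monotonic map of [0, 1] onto itself
    return path(tau ** 2)

m = 1.0
S1 = -m * worldline_length(path)
S2 = -m * worldline_length(reparametrized)
print(S1, S2)
```

Both values approximate the same invariant length, so the two actions agree up to discretization error, independent of how the world-line is labeled.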


To quantize the action, we first compute the momenta:

$$p_\mu = \frac{\delta L}{\delta\dot{x}^\mu} = -m\,\frac{\dot{x}_\mu}{\sqrt{\dot{x}_\nu^2}} \tag{21.3}$$

Because of the last equation, the momenta are not all independent, signaling the
presence of a gauge invariance. To find the precise dependence of the momenta,
we take the square of Eq. (21.3):

$$p_\mu^2 = m^2 \tag{21.4}$$


Finally, we now apply this constraint directly onto the state vectors $|\phi\rangle$ of the
theory:

$$(p_\mu^2 - m^2)|\phi\rangle = 0 \tag{21.5}$$

as in the Gupta-Bleuler formalism. At this point, we recognize this as the usual
Klein-Gordon equation. This equation, in turn, can be
derived from the standard covariant second quantized action:

$$S = \frac{1}{2}\int d^4x\,\phi(x)\left(\partial_\mu\partial^\mu + m^2\right)\phi(x) \tag{21.6}$$


In this way, we have made the transition from the first to the second quantized 
formalism for free point particles. 

Alternatively, the reparametrization invariance of the theory allows us to select 
the following gauge choice: 


$$x^0 = t = \tau \tag{21.7}$$


In this gauge, the action assumes the familiar nonrelativistic, ghost-free form in 
the limit of small velocities: 


$$S = -m\int dt\,\sqrt{1 - v_i^2} \sim \int dt\,\frac{1}{2}m v_i^2 \tag{21.8}$$


Unfortunately, the first quantized formalism treats interactions in a rather 
clumsy fashion. To introduce scattering, we do not add an explicit interaction 
term, as in the second quantized formalism. Instead, we define the scattering 
amplitude by taking the path integral over a space-time configuration that has the 
topology of a Feynman graph. By summing over all such topologies, we arrive at
the complete scattering amplitude:

$$A_N(p_1, p_2, \ldots, p_N) = \sum_{\text{topologies}}\int Dx\,\exp\left(i\int d\tau\,L\right)\prod_{j=1}^{N}e^{ip_j\cdot x_j} \tag{21.9}$$


By explicitly evaluating the integral, we find the product of a series of propagators
$\Delta_F$ and vertices. In this way, we can reproduce the usual Feynman series. By
specifying the topology of the graph, we can reproduce the Feynman amplitude
for any $\phi^n$ theory. To see this, we go to the Hamiltonian formulation, where the
constraint in Eq. (21.4) is enforced by a Lagrange multiplier:

$$L = p_\mu\dot{x}^\mu - \lambda(p_\mu^2 - m^2) \tag{21.10}$$

By functionally eliminating $p_\mu$ and $\lambda$, we can retrieve our original Lagrangian
appearing in Eq. (21.1).

Because we have reparametrization invariance, we have the freedom to fix the
gauge by setting $\lambda = 1$. The new Hamiltonian is therefore $H(p, x) = p_\mu^2 - m^2$. The
propagator in the Hamiltonian formalism is easily calculated. Between asymptotic
states, it is given by the integral:

$$\int_0^\infty d\tau\,e^{-\tau H} = \frac{1}{p_\mu^2 - m^2} = \Delta_F(p) \tag{21.11}$$

This reproduces the usual Feynman propagator.
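The proper-time integral behind the propagator is easy to check numerically. The sketch below assumes a Wick-rotated setting in which the Hamiltonian takes a positive numerical value, so that the integral converges; the value of `H`, the cutoff `tau_max`, and the function name are hypothetical choices made for illustration:

```python
import math

def proper_time_propagator(H, tau_max=60.0, n=200000):
    """Midpoint-rule estimate of the integral of exp(-tau * H) over tau in [0, tau_max]."""
    h = tau_max / n
    return sum(math.exp(-(k + 0.5) * h * H) for k in range(n)) * h

H = 2.5                      # hypothetical positive value of the gauge-fixed Hamiltonian
numeric = proper_time_propagator(H)
exact = 1.0 / H              # the propagator 1/H
print(numeric, exact)
```

The truncation at `tau_max` is harmless here because the integrand is exponentially small long before the cutoff.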

If we now perform the path integration over the entire graph, then the path
integral yields the product of these propagators joined together according to the
topology of the Feynman diagram. In this way, the Feynman rules for any $\phi^n$
theory can be reproduced in the first quantized approach.

This simple example demonstrates that the Klein—Gordon theory can be for- 
mulated as a first quantized theory, but it is rather clumsy. In particular, the sum 
over the topologies of all Feynman graphs must be inserted by hand, which is 
undesirable. Also, unitarity is not obvious at higher orders. By contrast, the 
second quantized formalism is cleaner and can be derived from a single action. 

Now let us make the transition from the point particle to the string, repeating 
the same steps as before. When a point moves in space-time, it sweeps out a 
one-dimensional world-line. When a string moves in space-time, it sweeps out a 
two-dimensional sheet, called the world-sheet. 

Let $X_\mu(\sigma, \tau)$ represent a vector defined in D-dimensional space-time that
begins at the origin of our coordinate system and ends at some point along the
two-dimensional string world-sheet, as in Figure 21.1. We can place coordinates
along the world-sheet labeled by $\xi^a = \{\sigma, \tau\}$.

Figure 21.1. $X^\mu$ is a vector that goes from the origin to a point on a two-dimensional
world-sheet swept out by a string.


Let the metric $\eta_{\mu\nu} = (+, -, -, \ldots, -)$ be the flat metric in D-dimensional
space-time, where $\mu = 0, 1, 2, \ldots, D - 1$. Let $g_{ab}$ represent a two-dimensional
metric defined on the surface.


Our action can be written as$^{1}$:

$$S = -\frac{1}{4\pi\alpha'}\int d^2\xi\,\sqrt{g}\,g^{ab}\,\partial_a X_\mu\,\partial_b X^\mu \tag{21.12}$$

where we define $\alpha' = 1/2$ for open strings and $\alpha' = 1/4$ for closed strings. $\alpha'$ is
equal to the Regge slope, which we will define shortly. The action is manifestly
reparametrization invariant. If we reparametrize the two-dimensional world-sheet
according to:

$$\sigma \to \tilde{\sigma}(\sigma, \tau), \qquad \tau \to \tilde{\tau}(\sigma, \tau) \tag{21.13}$$

then the action is invariant under this two-dimensional general co-ordinate
transformation if:

$$\tilde{g}^{ab}(\tilde{\xi}) = \left(\frac{\partial\tilde{\xi}^a}{\partial\xi^c}\right)\left(\frac{\partial\tilde{\xi}^b}{\partial\xi^d}\right)g^{cd}(\xi) \tag{21.14}$$

Under this transformation, the action is manifestly co-ordinate invariant (because
the transformation of $\sqrt{g}$ cancels against the transformation of the
two-dimensional measure).

Under an infinitesimal transformation, the transformation of the fields becomes:

$$\delta g^{ab} = \epsilon^c\partial_c g^{ab} - g^{ac}\partial_c\epsilon^b - g^{cb}\partial_c\epsilon^a$$
$$\delta X_\mu = \epsilon^a\partial_a X_\mu \tag{21.15}$$

The action is also trivially invariant under local scale transformations:

$$g_{ab} \to e^{\phi}g_{ab} \tag{21.16}$$


Since the two-dimensional metric in the action does not propagate, we may
eliminate it via its equations of motion. Then, we find:

$$g_{ab} \sim \partial_a X_\mu\,\partial_b X^\mu \tag{21.17}$$

Reinserting this value of the metric tensor back into the action, we find the original
Nambu-Goto action$^{2-4}$:

$$S = -\frac{1}{2\pi\alpha'}\int d^2\xi\,\sqrt{(\dot{X}_\mu X'^\mu)^2 - \dot{X}_\mu^2\,X_\nu'^2} \sim \text{surface area} \tag{21.18}$$

where $\dot{X}^\mu$ equals $\partial_\tau X^\mu$ and $X'^\mu$ equals $\partial_\sigma X^\mu$.

It is remarkable that string theory, which provides a comprehensive scheme in 
which to unite general relativity with quantum mechanics and all known physical 
forces, begins with this simple statement: the first quantized action is proportional 
to the area of the string world-sheet. 
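The elimination of the world-sheet metric can be made explicit. The following is a sketch of the intermediate steps, assuming only the action in Eq. (21.12) and the identity $\delta\sqrt{g} = -\frac{1}{2}\sqrt{g}\,g_{ab}\,\delta g^{ab}$; the shorthand $h_{ab}$ and the arbitrary function $f(\xi)$ are introduced here for illustration and do not appear in the text:

```latex
\begin{align*}
h_{ab} &\equiv \partial_a X_\mu\,\partial_b X^\mu ,\\
\frac{\delta S}{\delta g^{ab}} = 0
  &\;\Longrightarrow\; h_{ab} - \tfrac{1}{2}\,g_{ab}\,g^{cd}h_{cd} = 0
  \;\Longrightarrow\; g_{ab} = f(\xi)\,h_{ab},\\
\sqrt{g}\,g^{ab}h_{ab}
  &= \bigl(f\sqrt{|\det h|}\bigr)\,\frac{2}{f} = 2\sqrt{|\det h|},\\
|\det h| &= (\dot{X}_\mu X'^\mu)^2 - \dot{X}_\mu^2\,X_\nu'^2 ,\\
S &= -\frac{1}{4\pi\alpha'}\int d^2\xi\;2\sqrt{|\det h|}
   = -\frac{1}{2\pi\alpha'}\int d^2\xi\,
     \sqrt{(\dot{X}_\mu X'^\mu)^2 - \dot{X}_\mu^2\,X_\nu'^2}.
\end{align*}
```

The undetermined factor $f(\xi)$ drops out of the final result, which is another way of seeing the local scale invariance of the action.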


21.3 Quantizing the String 


To calculate the spectrum of the string and its properties, we will quantize the free 
theory using three different methods: 


1. The Gupta—Bleuler formalism in the conformal gauge. 
2. The light-cone gauge. 
3. The BRST formalism. 


21.3.1 Gupta—Bleuler Quantization 


The gauge degree of freedom allows us to choose the conformal gauge:

$$\text{Conformal gauge:} \quad g_{ab} = \delta_{ab} \tag{21.19}$$

Then, our Lagrangian linearizes to the following:

$$L = \frac{1}{4\pi\alpha'}\left[(\dot{X}_\mu)^2 + (X'_\mu)^2\right] = \frac{1}{\pi\alpha'}\,\partial_z X_\mu\,\partial_{\bar{z}}X^\mu \tag{21.20}$$


where, after a Wick rotation, we have introduced the complex variable z: 


$$z = \sigma + i\tau \tag{21.21}$$

The equations of motion are: 
a? a? 
(a5 e x) X, =0 (21.22) 


(In deriving these equations of motion, we had to eliminate a surface term; so
we must also set $X'_\mu = 0$ at the ends of the string.) The gauge-fixed action is no
longer locally reparametrization invariant, but it is still globally invariant under a
subgroup of reparametrizations, the conformal transformations:

$$z \to f(z) \tag{21.23}$$

Under conformal transformations, the string transforms as:

$$\delta X_\mu(z, \bar{z}) = \epsilon\,\partial_z X_\mu + \bar{\epsilon}\,\partial_{\bar{z}}X_\mu \tag{21.24}$$


To quantize the system, we first introduce the canonical conjugate:

$$P_\mu = \frac{\delta L}{\delta\dot{X}^\mu}$$
$$[P_\mu(\sigma), X_\nu(\sigma')] = -i\eta_{\mu\nu}\,\delta(\sigma - \sigma') \tag{21.25}$$


We can always decompose the string variable via the Fourier series:

$$X^\mu(\sigma) = x^\mu + i\sum_{n=1}^{\infty}\frac{1}{\sqrt{n}}\left(a_n^\mu - a_{-n}^\mu\right)\cos n\sigma$$
$$P^\mu(\sigma) = \frac{1}{\pi}\left(p^\mu + \sum_{n=1}^{\infty}\sqrt{n}\left(a_n^\mu + a_{-n}^\mu\right)\cos n\sigma\right) \tag{21.26}$$

where the commutation relations between the string variable and its momentum
conjugate are satisfied if:

$$[a_{n,\mu}, a_{m,\nu}] = \delta_{n,-m}\,\eta_{\mu\nu} \tag{21.27}$$

To calculate the spectrum of states, we calculate the Hamiltonian:

$$H = \int_0^\pi d\sigma\,(P_\mu\dot{X}^\mu - L) = \sum_{n=1}^{\infty}n\,a_{-n,\mu}a_n^\mu + \alpha' p^2 \tag{21.28}$$


Fortunately, the Hamiltonian is the simplest possible operator for an extended
object: the sum over an infinite set of independent harmonic oscillators. The
eigenfunctions of the Hamiltonian are therefore just the products of free creation
operators $a_{-n,\mu}$ acting on the vacuum:

$$\prod_{n,\mu}\{a_{-n,\mu}\}|0\rangle \tag{21.29}$$


This allows us to display the spectrum of states, which correspond to an infinite
tower of point particles of arbitrary spin. The lowest states include a tachyon
and a massless vector meson (the Maxwell field or, if we include isospin, the
Yang-Mills field):

$$\text{Tachyon}: \quad |0\rangle$$
$$\text{Massless vector}: \quad a^\mu_{-1}|0\rangle \tag{21.30}$$


(Historically, the tachyon was viewed as a troublesome feature of the string
model. However, one can also view it as a blessing in disguise, because it signals 
the presence of spontaneous symmetry breaking to a new, perhaps more physical 
vacuum. Also, the tachyon disappears when we generalize the theory to the 
supersymmetric string.) 

The series continues indefinitely. The next few states include a massive spin-2
field and a massive vector field:

$$\text{Massive spin-2 field}: \quad a^\mu_{-1}a^\nu_{-1}|0\rangle$$
$$\text{Massive vector field}: \quad a^\mu_{-2}|0\rangle \tag{21.31}$$


In Figure 21.2, we plot the resonances on a chart, with mass squared on the 
x axis and spin on the y axis. The linearly rising trajectories are called “Regge 
trajectories” with Regge slope $\alpha'$. The point where the leading Regge trajectory
hits the y axis is called the “intercept.” The important point is to observe that 
the massless Maxwell and Yang-Mills fields (with intercept one) are necessarily 
included as part of the string spectrum. For closed strings, the intercept is equal to 
two, so we necessarily have a theory of massless gravitons. (In the limit of zero 
slope, we see that only the massless particles remain. Thus, the zero-slope limit 
is a convenient limit in which we may retrieve point-particle field theory.) 
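The leading Regge trajectory can be tabulated directly. The sketch below assumes the mass-shell relation $\alpha' m^2 = N - a$ for oscillator level N, which is one common sign convention and is not spelled out in the text, together with the open-string values $\alpha' = 1/2$ and intercept $a = 1$; the function names are illustrative:

```python
from fractions import Fraction

ALPHA_PRIME = Fraction(1, 2)   # open-string Regge slope used in the text
INTERCEPT = 1                  # intercept of the leading trajectory

def mass_squared(level):
    """m^2 at oscillator level N, assuming alpha' m^2 = N - a."""
    return (level - INTERCEPT) / ALPHA_PRIME

def leading_spin(level):
    """Highest spin at level N on the leading trajectory: J = alpha' m^2 + a."""
    return ALPHA_PRIME * mass_squared(level) + INTERCEPT

trajectory = [(N, mass_squared(N), leading_spin(N)) for N in range(4)]
for N, m2, J in trajectory:
    print(f"N={N}  m^2={m2}  J_max={J}")
```

Under these assumptions, level 0 is the tachyon with negative mass squared, level 1 is the massless vector, and level 2 carries the massive spin-2 state.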




Figure 21.2. Linearly rising Regge trajectories for open strings. The resonances of string 
theory are states of arbitrarily high spin and mass. In the massless closed string sector, the 
theory necessarily includes quantum gravity. 


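Next, we calculate the energy-momentum tensor of the system:

$$T_{ab} = -\frac{4\pi\alpha'}{\sqrt{g}}\,\frac{\delta L}{\delta g^{ab}} \tag{21.32}$$

This, in turn, can be shown to equal:

$$T_{ab} = \partial_a X_\mu\,\partial_b X^\mu - \frac{1}{2}g_{ab}\,g^{cd}\,\partial_c X_\mu\,\partial_d X^\mu \tag{21.33}$$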


We notice several important features of the energy-momentum tensor; that is, 
it satisfies: 
ji Tete (21.34) 


Notice that the energy-momentum tensor forms a closed algebra. The Fourier 
modes of the energy-momentum tensor form the Virasoro generators L»°: 


1 : imo —imo 
eee al do [e (Too + Toi) +e” (Too — To1)] 


= ral do[e'""(X +X’) 


U 
SIO Ja 


iH] 


1 co 
5 Dy Mm—nGtn (21.35) 


where $\alpha_n = \sqrt{|n|}\,a_n$ for $n \neq 0$ and $\alpha_0 = \sqrt{2\alpha'}\,p$. They obey the algebra:

$$[L_n, L_m] = (n - m)L_{n+m} + \frac{c}{12}\,n(n^2 - 1)\,\delta_{n,-m} \tag{21.36}$$

where c is the central charge and equals the dimension of space-time.
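A quick consistency check on this algebra is the Jacobi identity, which restricts the allowed form of the central term. The following sketch assumes the standard form of the central extension, $\frac{c}{12}n(n^2 - 1)\delta_{n,-m}$, and verifies the identity over a finite range of mode numbers; the helper names are hypothetical:

```python
from fractions import Fraction

def central_term(n, m, c):
    """Central extension (c/12) n (n^2 - 1) delta_{n,-m} of [L_n, L_m]."""
    return Fraction(c, 12) * n * (n * n - 1) if n + m == 0 else Fraction(0)

def jacobi_defect(n, m, k, c):
    """Coefficients (of L_{n+m+k}, of the identity) in the cyclic Jacobi sum."""
    l_coeff, c_coeff = 0, Fraction(0)
    for a, b, d in ((n, m, k), (m, k, n), (k, n, m)):
        # [[L_a, L_b], L_d] = (a - b) [L_{a+b}, L_d]; the central piece of
        # [L_a, L_b] is a number and commutes with L_d
        l_coeff += (a - b) * (a + b - d)
        c_coeff += (a - b) * central_term(a + b, d, c)
    return l_coeff, c_coeff

c = 26
defects = [jacobi_defect(n, m, k, c)
           for n in range(-6, 7) for m in range(-6, 7) for k in range(-6, 7)]
jacobi_holds = all(l == 0 and z == 0 for l, z in defects)
sl2_central = [central_term(n, -n, c) for n in (-1, 0, 1)]
print(jacobi_holds, sl2_central)
```

The central term vanishes for n = -1, 0, 1, so the generators $L_{-1}$, $L_0$, $L_1$ form a closed subalgebra with no anomaly.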
(Another way to derive this algebra is to start with the Nambu-Goto action, and
then construct $P_\mu$ as $\delta L/\delta\dot{X}^\mu$. We find that the momenta are not all independent,
but instead satisfy an additional set of constraints:

$$P_\mu^2 + \frac{1}{(2\pi\alpha')^2}X_\mu'^2 = 0$$
$$P_\mu X'^\mu = 0 \tag{21.37}$$

The moments of these constraints also form the Virasoro generators.)

In the Gupta-Bleuler quantization scheme, the ghosts that propagate in the
system (corresponding to the longitudinal modes of $a^\mu$) can be eliminated by
applying the gauge constraints directly on the Fock space. Thus, we apply:

$$L_n|R\rangle = 0, \qquad n > 0$$
$$(L_0 - 1)|R\rangle = 0 \tag{21.38}$$

where the second condition is the mass-shell condition. After a rather tedious
calculation, one can show that these conditions are sufficient to eliminate all
unphysical states from the physical spectrum. However, there is an unexpected
result: the spectrum is ghost-free only if the dimension of space-time is 26.


21.3.2 Light-Cone Gauge 


As in ordinary field theory, we can alternatively formulate the system in the light- 
cone gauge’ where the unphysical longitudinal modes are eliminated from the 
very start. 

We will define the light-cone coordinates as:

$$X^\pm = \frac{1}{\sqrt{2}}\left(X^0 \pm X^{D-1}\right) \tag{21.39}$$

and fix the gauge as:

$$X^+(\sigma, \tau) = p^+\tau \tag{21.40}$$


We can use this light-cone gauge to eliminate all nonphysical modes. By
taking the constraints in Eq. (21.37), we can eliminate unwanted longitudinal
vibrations by solving for:

$$P^-(\sigma) = \frac{\pi}{2p^+}\left(P_i^2 + \frac{X_i'^2}{\pi^2}\right)$$
$$X'^-(\sigma) = \frac{\pi}{p^+}\,P_i X'_i \tag{21.41}$$

The Hamiltonian in the light-cone gauge reduces to:

$$H = \frac{1}{2}\int_0^\pi\left(\pi P_i^2 + \frac{X_i'^2}{\pi}\right)d\sigma \tag{21.42}$$


Notice that the physical Fock space consists of transverse harmonic oscillators,
which are ghost-free. Of course, we still must check that the theory is
Lorentz invariant. We do this by rewriting the Lorentz generators in terms of the
independent transverse modes. This is a bit awkward, but straightforward:

$$M^{\mu\nu} = \int_0^\pi d\sigma\,\left(X^\mu P^\nu - X^\nu P^\mu\right) = x^\mu p^\nu - x^\nu p^\mu - i\sum_{n=1}^{\infty}\left(a^\mu_{-n}a^\nu_n - a^\nu_{-n}a^\mu_n\right) \tag{21.43}$$


The surprising feature of the Lorentz generators is that, in general, they fail to
close properly unless we impose one more constraint:

$$[M^{-i}, M^{-j}] = -\frac{1}{(p^+)^2}\sum_{n=1}^{\infty}\left(\alpha^i_{-n}\alpha^j_n - \alpha^j_{-n}\alpha^i_n\right)\Delta_n = 0 \tag{21.44}$$

where:

$$\Delta_n = \frac{n}{12}(26 - D) + \frac{1}{n}\left(\frac{D - 26}{12} + 2 - 2a\right) \tag{21.45}$$

where a is the intercept. In order to have Lorentz invariance, we must set $\Delta_n$
equal to zero, that is,

$$D = 26, \qquad a = 1 \tag{21.46}$$

This is consistent with the result found in the conformal gauge, that self-consistency
of the string theory forces the dimension of space-time to be 26.
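The claim that the anomaly coefficient vanishes only for D = 26 and a = 1 can be checked with exact rational arithmetic. The sketch below assumes the form $\Delta_n = \frac{n}{12}(26 - D) + \frac{1}{n}\left(\frac{D - 26}{12} + 2 - 2a\right)$ and scans a hypothetical grid of dimensions and integer intercepts:

```python
from fractions import Fraction

def delta_n(n, D, a):
    """Anomaly coefficient (n/12)(26 - D) + (1/n)((D - 26)/12 + 2 - 2a)."""
    return (Fraction(n, 12) * (26 - D)
            + Fraction(1, n) * (Fraction(D - 26, 12) + 2 - 2 * a))

def lorentz_invariant(D, a, n_max=10):
    """True if the anomaly vanishes for every mode number up to n_max."""
    return all(delta_n(n, D, a) == 0 for n in range(1, n_max + 1))

# Hypothetical search grid: dimensions 2..40 and intercepts 0, 1, 2
solutions = [(D, a) for D in range(2, 41) for a in (0, 1, 2)
             if lorentz_invariant(D, a)]
print(solutions)
```

Because the terms proportional to n and to 1/n must vanish separately, the dimension and the intercept are fixed independently.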
21.3.3 BRST Quantization


Likewise, the BRST method should also reproduce this result. To BRST quantize
the string, we start with the invariance of the metric tensor:

$$\delta g_{ab} = g_{ac}\,\partial_b\delta v^c + \partial_a\delta v^c\,g_{cb} + \delta v^c\,\partial_c g_{ab} = \nabla_a\delta v_b + \nabla_b\delta v_a \tag{21.47}$$

The BRST procedure begins with the construction of the Faddeev-Popov
determinant, which is the determinant of the variation of the constraint. This,
however, is just the determinant of the operator $\nabla_a$. The Faddeev-Popov
determinant can thus be rewritten as:

$$\Delta_{FP} = \det(\nabla_a) = \det\nabla_z\,\det\nabla_{\bar{z}} \tag{21.48}$$

We now introduce Faddeev-Popov ghosts by exponentiating this determinant:

$$\Delta_{FP} = \int Db\,D\bar{b}\,Dc\,D\bar{c}\;e^{iS_{gh}} \tag{21.49}$$


where: 1 
Gre - (b eed a.) (21.50) 
ww 


As usual, adding this ghost term to the original conformal gauge action in Eq. 
(21.20) yields a residual global symmetry, called the BRST symmetry, which can 
be generated by the BRST charge, which is computable from the Noether current. 
A straightforward calculation of the Noether current yields®: 


QO Gee (ef + air e aby.) 


n=—0o0o 


lle = ae > age + inca 


n=] 


— ; Se Com C2nOnamn + GE) (21.51) 
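where:

$$\{b_n, c_m\} = \delta_{n,-m} \tag{21.52}$$

and:

$$Q^2 = \frac{1}{2}\sum_{m=-\infty}^{\infty}\left[\frac{D}{12}(m^3 - m) + \frac{1}{6}(m - 13m^3) + 2am\right]c_{-m}c_m \tag{21.53}$$

which vanishes only if D = 26 and a = 1, as before.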
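The nilpotency condition can be verified in the same way as the light-cone anomaly. The following sketch assumes that the coefficient of $c_{-m}c_m$ in $Q^2$ is $\frac{D}{12}(m^3 - m) + \frac{1}{6}(m - 13m^3) + 2am$; the search grid and variable names are hypothetical:

```python
from fractions import Fraction

def brst_anomaly(m, D, a):
    """Coefficient of c_{-m} c_m in Q^2."""
    return (Fraction(D, 12) * (m**3 - m)
            + Fraction(1, 6) * (m - 13 * m**3)
            + 2 * a * m)

nilpotent = [(D, a) for D in range(2, 41) for a in (0, 1, 2)
             if all(brst_anomaly(m, D, a) == 0 for m in range(1, 11))]

matter = Fraction(26, 12)   # m^3 coefficient contributed by 26 string coordinates
ghost = Fraction(-13, 6)    # m^3 coefficient contributed by the ghosts
print(nilpotent, matter + ghost)
```

The cubic terms cancel between the string coordinates and the ghosts only in 26 dimensions, and the remaining linear term then fixes the intercept.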
The new Fock space now consists of all possible products of all possible
creation oscillators, including the ghost oscillators:

$$\prod_{n,m,p,\mu}\{a^\mu_{-n}\}\{b_{-m}\}\{c_{-p}\}|0\rangle \tag{21.54}$$

Although the Fock space now has a vastly increased number of ghost states, we
can eliminate all of them by applying the BRST operator onto the Fock space:

$$Q|\psi\rangle = 0 \tag{21.55}$$


Thus, all three quantization programs can be shown to have the same physical 
spectrum if the dimension of space-time is 26. Also, the intercept condition forces 
us to incorporate spin-1 Maxwell fields for the open string and spin-2 gravitational 
fields for the closed string. 
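Once D = 26 is fixed, the number of physical open-string states at each level can be counted from the D - 2 = 24 transverse oscillators of the light-cone gauge. The sketch below assumes that the degeneracies are the series coefficients of $\prod_{n \geq 1}(1 - x^n)^{-24}$, a standard counting argument that the text does not state explicitly; the function name is illustrative:

```python
def level_degeneracies(n_transverse, max_level):
    """Series coefficients of prod_{n>=1} (1 - x^n)^(-n_transverse) up to x^max_level."""
    coeffs = [1] + [0] * max_level
    for n in range(1, max_level + 1):
        for _ in range(n_transverse):
            # Multiply the truncated series by 1 / (1 - x^n), in place
            for k in range(n, max_level + 1):
                coeffs[k] += coeffs[k - n]
    return coeffs

D = 26
counts = level_degeneracies(D - 2, 4)
print(counts)
```

The 24 states at level 1 are the transverse polarizations of a massless vector in 26 dimensions, and the 324 states at level 2 fill out a single massive symmetric traceless tensor.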


21.4 Scattering Amplitudes 


Interactions are introduced by postulating that the string can break and reform an 
arbitrary number of times. The world-sheet corresponding to this is therefore the 
set of all two-dimensional complex surfaces with g holes or “handles,” as shown 
in Figure 21.3. (Two-dimensional complex surfaces are called Riemann surfaces.) 
In this way, we introduce Feynman-like diagrams in a first quantized formalism.’ 

These simple Feynman-like diagrams conceal a large amount of information. 
For example, if we carefully extract out the zero mass, spin-2 sector from these 
Feynman diagrams, we will reproduce all of Einstein’s theory of general relativity 
power expanded around flat space. 


Figure 21.3. Strings can break and reform, thereby sweeping out two-dimensional
Riemann surfaces of genus g. In this way, the string reproduces Feynman-like diagrams.


Figure 21.4. The world-sheet for an N-point tree amplitude. For calculational purposes, 
we have conformally mapped the world-sheet onto this surface. Momenta from tachyons 
enter from the top of the world-sheet. 


Let $\mathcal{M}_{N,g}$ represent all conformally inequivalent Riemann surfaces with
genus g and N “punctures” (external strings) located at infinity. Then the complete
N-point tachyon amplitude is given by summing over all functional
integrals defined over $\mathcal{M}_{N,g}$:

$$A_N(k_1, k_2, \ldots, k_N) = \sum_g\int_{\mathcal{M}_{N,g}}DX\,d\mu\,\exp\left[i\left(S + \sum_{i=1}^{N}k_{i\mu}X_i^\mu\right)\right] = \sum_g\int_{\mathcal{M}_{N,g}}d\mu\,\left\langle\prod_{i=1}^{N}e^{ik_i\cdot X_i}\right\rangle \tag{21.56}$$

where $d\mu$ is a conformal measure on the Riemann surface. This is a generalization
of the first quantized point-particle path integral that we analyzed in Eq. (21.9).

Fortunately, for tree diagrams this amplitude is easily calculable. Because 
the theory is conformally invariant, we will find it convenient to perform the 
functional integral over the world-sheet corresponding to a long horizontal strip 
with momenta entering the world-sheet at points along the top (Fig. 21.4). To 
obtain the amplitude, we will then conformally map this strip onto the upper 
half-plane. The external tachyon lines will then lie on the real axis. 

To solve this functional integral, let us shift the integration variable by a
solution to the classical equation:

$$X_\mu \to X_{\mu,\text{classical}} + X_\mu \tag{21.57}$$

where the classical solution is determined via the Green's function for Laplace's
equation on the upper half-plane:

$$X_{\mu,\text{classical}}(z) = i\alpha'\int G(z, z')\,J_\mu(z')$$
$$G(z, z') = \log|z - z'| + \log|z - \bar{z}'| \tag{21.58}$$

(This Neumann function is easily calculated by the method of images. The
electrostatic potential at a point z in the upper half-plane is the sum of the contributions
from a point charge placed at $z'$ and also the image charge placed in the lower
half-plane at $\bar{z}'$.)

With this Neumann function, the Gaussian integral can be performed, leaving 
us with the N-point tree amplitude!°-"*: 


ave fT Tas IG ieee (21.59) 


2<i<jf<Nn 


where the $z_i$ are on the real axis and obey $\infty = z_1 \geq z_2 = 1 \geq z_3 \geq \cdots \geq z_{N-1} \geq z_N = 0$.
This is the scattering amplitude that describes the scattering of N tachyons.
For N = 4, this expression reduces down to the celebrated Veneziano
formula$^{15,16}$:

$$A_4(s, t) = \int_0^1 dx\,x^{-s/2-2}(1 - x)^{-t/2-2} = \frac{\Gamma(-\alpha(s))\,\Gamma(-\alpha(t))}{\Gamma(-\alpha(s) - \alpha(t))} \tag{21.60}$$

where $\alpha(s) = 1 + \frac{1}{2}s$, $\alpha(t) = 1 + \frac{1}{2}t$, $s = (k_1 + k_2)^2$, and $t = (k_2 + k_3)^2$.

The accidental discovery of this formula in 1968 by Veneziano and Suzuki, 
who were trying to describe the scattering matrix for hadronic interactions, marked 
the birth of what eventually became superstring theory. (They were originally 
trying to find a formula for the scattering of pions, using S matrix theory and 
finite-energy sum rules, when they stumbled across the Euler beta function, which
satisfied almost all the properties of the S matrix except unitarity.)
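The equality between the integral representation and the ratio of gamma functions can be checked numerically. The sketch below assumes the linear trajectory $\alpha(x) = 1 + x/2$ and picks hypothetical values of s and t for which both trajectory functions are negative, so that the integral converges without analytic continuation:

```python
import math

def alpha(x):
    """Linear Regge trajectory alpha(x) = 1 + x/2."""
    return 1.0 + 0.5 * x

def veneziano_integral(s, t, n=200000):
    """Midpoint-rule estimate of the Euler beta integral form of the amplitude."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        total += x ** (-alpha(s) - 1.0) * (1.0 - x) ** (-alpha(t) - 1.0)
    return total * h

def veneziano_gamma(s, t):
    """Closed form Gamma(-alpha(s)) Gamma(-alpha(t)) / Gamma(-alpha(s) - alpha(t))."""
    return (math.gamma(-alpha(s)) * math.gamma(-alpha(t))
            / math.gamma(-alpha(s) - alpha(t)))

s, t = -5.0, -7.0   # hypothetical values with alpha(s) = -1.5, alpha(t) = -2.5
numeric = veneziano_integral(s, t)
closed = veneziano_gamma(s, t)
print(numeric, closed)
```

The closed form is symmetric under the exchange of s and t, and its poles at non-negative integer values of the trajectory function correspond to the resonances on the Regge trajectories.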

In practice, it is often more convenient to work with the operator formalism. To 
convert the path integral to the operator formalism, we need to make the transition 
from the Lagrangian formalism to the Hamiltonian formalism on the world-sheet. 
Then the path integral will be defined in terms of $X^\mu$ as well as its conjugate
momentum $P^\mu$, which is an operator. This transition is easily done, since the
Lagrangian describes an infinite set of noninteracting harmonic oscillators.

When we make the transition to the Hamiltonian formalism, the vertex $e^{ik\cdot X}$
appearing in the path integral now becomes the operator expression:

$$V(k) = {:}e^{ik\cdot X}{:} = \exp\left(k\cdot\sum_{n=1}^{\infty}\frac{a_{-n}}{\sqrt{n}}\right)e^{ik\cdot x}\exp\left(-k\cdot\sum_{n=1}^{\infty}\frac{a_n}{\sqrt{n}}\right) \tag{21.61}$$

(We must normal order this vertex. This is because in the path integral for the
N-point function we deleted the terms with i = j, which diverge. To eliminate this
divergent quantity in the operator formalism, we normal order the oscillators.)

In the Hamiltonian formalism, the transition element between two string states
is given by $\langle X|e^{-iH\tau}|X'\rangle$. If we make a Wick rotation, then the integrated propagator
between two states becomes:

$$D = \int_0^\infty d\tau\,e^{-\tau(L_0 - 1)} = \frac{1}{L_0 - 1} \tag{21.62}$$

sandwiched between any two string states. The Hamiltonian on the world-sheet
is given by $L_0 - 1$.
For the path integral describing the N-point amplitude, the transition to the 
Hamiltonian formalism gives us an expression for the N-point function!’: 


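$$A_N = \langle 0, k_1|V(k_2)\,D\,V(k_3)\cdots V(k_{N-1})|0, k_N\rangle \tag{21.63}$$

where $|0, k\rangle = e^{ik\cdot x}|0\rangle$, and where $x^\mu$ is the center-of-mass variable describing $X^\mu$.
To contract these oscillators, which are all written in terms of exponentials, we
use the coherent state formalism. We define a coherent state by:

$$|\lambda\rangle = \sum_{n=0}^{\infty}\frac{\lambda^n}{n!}(a^\dagger)^n|0\rangle = e^{\lambda a^\dagger}|0\rangle \tag{21.64}$$

Then we have the identities:

$$\langle\mu|\lambda\rangle = e^{\mu^*\lambda}$$
$$x^{a^\dagger a}|\lambda\rangle = |x\lambda\rangle$$
$$e^{\mu a^\dagger}|\lambda\rangle = |\lambda + \mu\rangle \tag{21.65}$$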


By contracting the oscillators, we reproduce the N-point amplitude in Eq. (21.59), 
as desired. 

From a physical point of view, the more interesting theory is the closed 
string theory, which includes Einstein’s theory of general relativity as a subset. 
Closed strings can also be quantized in much the same way. The only major 
difference mathematically is that the closed string contains two independent sets 
of oscillators, not just one. 

We can decompose the string variable in terms of two sets of commuting 
harmonic oscillators: 


a’ 1/2 co il ; Q 
Xe) = ge (5) > — (ane~""" + G,e'"” 
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Pu 1 ~ —ino ° ino 
— n{ — layne — 1a,e 
P,(o) D + on Joa ‘k Vn ( : 


Higher ime"") | (21.66) 


where $X_\mu(0) = X_\mu(2\pi)$.
The Hamiltonian now also has doubled the number of oscillators: 


2n be oe) 
H= nf do («7 + at =) (nalan+na}a,)+a'p?, (21.67) 
0 n=) 


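The graviton naturally emerges as the massless state with spin 2:

$$\text{Tachyon}: \quad |0\rangle$$
$$\text{Graviton}: \quad \left(a^\mu_{-1}\tilde{a}^\nu_{-1} + a^\nu_{-1}\tilde{a}^\mu_{-1}\right)|0\rangle \tag{21.68}$$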


This, in fact, is perhaps the most attractive, and most mysterious, feature of string 
theory, that general relativity is necessarily part of the theory. While other point- 
particle theories try to avoid including the graviton, string theory views gravity as 
an inseparable part of its formulation. 

The propagator for closed strings is similar to the open string propagator, 
except for one difference: There is an extra rotation factor P that guarantees that 
the final result is not dependent on the origin of the parametrization. Thus, the 
propagator is: 


——>——_ P (21.69) 
Lo+Lo —2 


where: 
Qn ; re 
P= i dO ei (Lo—Lo) (21.70) 
0 


where Lo — Lo is the operator that rotates the closed string. The propagator can 
be written in an equivalent way: 


1 7 i my 
zlo—2slo—2 g2, = sint(Lo—Lo) (21-71) 


~ On |z|<1 (Lo — Lo) Pot Lo — 2 
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The N-point amplitude then becomes!*:!9: 


$$A_N = \sum_{\text{perm}}\langle 0, k_1|V(k_2)\,D\,V(k_3)\cdots V(k_{N-1})|0, k_N\rangle \qquad (21.72)$$


where the permutation is taken over all possible orderings of the external legs. 
When expanded out, the resulting N-point amplitude for closed strings is almost
identical to the one for open strings (except the $z_i$ variables are now integrated
over all complex space, not just the real axis).

The Virasoro constraints can also be written for the theory, which now become: 


$$L_n|\phi\rangle = \tilde L_n|\phi\rangle = 0$$
$$(L_0 + \tilde L_0 - 2)|\phi\rangle = 0$$
$$(L_0 - \tilde L_0)|\phi\rangle = 0 \qquad (21.73)$$

where the last constraint is due to the fact that the states should be independent of
where we choose the origin of our parametrization.
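
As a small numerical sketch of this last constraint (the function name and the discretization are illustrative assumptions, not part of the text), one can average the rotation phase $e^{i\theta n}$ over a full turn, where n stands for an integer eigenvalue of $L_0 - \tilde L_0$. The average, normalized by $2\pi$, behaves as a projector: it is 1 for n = 0 and vanishes otherwise, so only states with matching left and right levels survive.

```python
import cmath
import math


def rotation_projector(n, steps=2000):
    """Average exp(i*theta*n) over one full rotation 0 <= theta < 2*pi.

    n plays the role of the eigenvalue of L0 - L0~ on a given state
    (taken to be an integer in this sketch).
    """
    dtheta = 2.0 * math.pi / steps
    total = sum(cmath.exp(1j * k * dtheta * n) for k in range(steps))
    return total * dtheta / (2.0 * math.pi)


for n in (0, 1, 2, -3):
    print(n, abs(rotation_projector(n)))
```

Up to rounding error, the printed magnitude is 1 for n = 0 and essentially zero for the other values.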


21.5 Superstrings 


To make the spectrum more realistic, we must now turn to the superstring, which 
introduces a new symmetry: supersymmetry. In fact, supersymmetry, as a sym- 
metry of an action, was first discovered in 1971 in string theory, and only later 
was adapted to four-dimensional point particle theories. 

Let us introduce a new fermion field $\psi_\mu$, the counterpart of $X_\mu$, which is
a vector in space-time but transforms as a two-dimensional spinor on the
two-dimensional world-sheet. Then, the Neveu-Schwarz-Ramond (NSR) model^{20,21}
can be expressed as a two-dimensional action. Gervais and Sakita introduced the
following Lagrangian^{22}:

$$\mathcal{L} = -\frac{1}{2\pi}\left(\partial_a X_\mu\,\partial^a X^\mu - i\bar\psi^\mu\rho^a\partial_a\psi_\mu\right) \qquad (21.74)$$


where: 


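$$\rho^0 = \begin{pmatrix}0 & -i\\ i & 0\end{pmatrix}, \qquad \rho^1 = \begin{pmatrix}0 & i\\ i & 0\end{pmatrix} \qquad (21.75)$$

and:

$$\psi^\mu = \begin{pmatrix}\psi_0^\mu\\ \psi_1^\mu\end{pmatrix}, \qquad \bar\psi^\mu = \psi^\mu\rho^0 \qquad (21.76)$$

with the metric $\{\rho^a, \rho^b\} = -2\eta^{ab}$, where $\eta^{ab}$ is diagonal and given by (-1, +1).
Written explicitly, this equals:

$$\mathcal{L} = \frac{1}{2\pi}\left[\dot X_\mu\dot X^\mu - X'_\mu X'^\mu + i\psi_0(\partial_\tau + \partial_\sigma)\psi_0 + i\psi_1(\partial_\tau - \partial_\sigma)\psi_1\right] \qquad (21.77)$$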


The important feature of this action is that it is explicitly invariant under the 
following supersymmetry: 


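$$\delta X^\mu = \bar\epsilon\,\psi^\mu, \qquad \delta\psi^\mu = -i\rho^a\partial_a X^\mu\,\epsilon \qquad (21.78)$$

The energy-momentum tensor can be written as:

$$T_{ab} = \partial_a X_\mu\,\partial_b X^\mu + \frac{i}{4}\bar\psi^\mu\rho_a\partial_b\psi_\mu + \frac{i}{4}\bar\psi^\mu\rho_b\partial_a\psi_\mu - (\text{trace}) \qquad (21.79)$$

By Noether's method, we can derive the conserved supercurrent:

$$J_a = \frac{1}{2}\rho^b\rho_a\psi^\mu\,\partial_b X_\mu \qquad (21.80)$$

We can rewrite the superconformal current $J_a$ as:

$$T_F = \frac{1}{2}\psi^\mu\,\partial_z X_\mu \qquad (21.81)$$

and its Fourier moments as:

$$G_r = 2\oint\frac{dz}{2\pi i}\; z^{r+1/2}\,T_F(z) \qquad (21.82)$$
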
We quantize the fermionic oscillators in the usual way: 


$$\{\psi_a^\mu(\sigma,\tau), \psi_b^\nu(\sigma',\tau)\} = \pi\,\delta_{ab}\,\delta(\sigma-\sigma')\,\eta^{\mu\nu} \qquad (21.83)$$


Because we have more fields, there are actually two different boundary
conditions we may take on the theory, either periodic (R) or antiperiodic (NS)
boundary conditions. The $\psi_i$ fields are equal to each other at $\sigma = 0$, but at
$\sigma = \pi$ they obey:

R: $\psi_0(\pi,\tau) = \psi_1(\pi,\tau)$

NS: $\psi_0(\pi,\tau) = -\psi_1(\pi,\tau) \qquad (21.84)$

With these boundary conditions, the harmonic oscillator decomposition is given
by:

R: $\psi_{0,1}^\mu = \frac{1}{\sqrt{2}}\sum_{n=-\infty}^{\infty} d_n^\mu\, e^{-in(\tau\pm\sigma)}$

NS: $\psi_{0,1}^\mu = \frac{1}{\sqrt{2}}\sum_{r\in Z+1/2} b_r^\mu\, e^{-ir(\tau\pm\sigma)} \qquad (21.85)$

where we associate the lower index 0 (1) with the + (-) sign appearing in the
exponential, where the R states are integral moded while the NS states are
half-integral moded, and where we have the anticommutation relations among
oscillators:

R: $\{d_n^\mu, d_m^\nu\} = \eta^{\mu\nu}\delta_{n,-m}$

NS: $\{b_r^\mu, b_s^\nu\} = \eta^{\mu\nu}\delta_{r,-s} \qquad (21.86)$


The Fock space of the theory now describes either an infinite tower of bosonic 
fields, or fermionic ones: 


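R: $\prod_{n,\mu}\prod_{m,\nu}\{a_n^{\mu\dagger}\}\{d_{-m}^{\nu}\}|0\rangle\, u_\alpha$

NS: $\prod_{n,\mu}\prod_{r,\nu}\{a_n^{\mu\dagger}\}\{b_{-r}^{\nu}\}|0\rangle \qquad (21.87)$

where $u_\alpha$ is a 10-dimensional (32-component) spinor.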
The commutators and anticommutators of the energy-momentum tensor and 
the supercurrent now form a closed algebra, called the superconformal algebra: 


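$$[L_m, L_n] = (m-n)L_{m+n} + \frac{\hat c}{8}(m^3 - m)\,\delta_{m+n,0}$$
$$[L_m, G_r] = \left(\frac{m}{2} - r\right)G_{m+r}$$
$$\{G_r, G_s\} = 2L_{r+s} + \frac{\hat c}{2}\left(r^2 - \frac{1}{4}\right)\delta_{r+s,0} \qquad (21.88)$$

where $\hat c = 2c/3$.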
An explicit representation of the NS superconformal operators is given by: 


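$$L_m = \frac{1}{2}\sum_{n=-\infty}^{\infty} :\alpha_{-n}\cdot\alpha_{m+n}: + \frac{1}{2}\sum_{r=-\infty}^{\infty}\left(r + \frac{m}{2}\right) :b_{-r}\cdot b_{m+r}:$$
$$G_r = \sum_{n=-\infty}^{\infty}\alpha_{-n}\cdot b_{r+n} \qquad (21.89)$$

For the R sector, the generators are given by: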


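$$L_m = \frac{1}{2}\sum_{n=-\infty}^{\infty} :\alpha_{-n}\cdot\alpha_{m+n}: + \frac{1}{2}\sum_{n=-\infty}^{\infty}\left(n + \frac{m}{2}\right) :d_{-n}\cdot d_{m+n}:$$
$$G_m = \sum_{n=-\infty}^{\infty}\alpha_{-n}\cdot d_{m+n} \qquad (21.90)$$

Finally, let us define the operator $Q_{BRST}$. We find that the Faddeev-Popov
ghost factor can be written in terms of two commuting ghosts $\beta$, $\gamma$ as:

$$\mathcal{L} = -\frac{1}{\pi}\left(\beta\,\bar\partial\gamma + \text{c.c.}\right) \qquad (21.91)$$

where c.c. is the complex conjugate.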
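The complete superconformal generators must also include the presence of
the b, c and $\beta$, $\gamma$ ghosts:

$$L_m^{\text{ghost}} = \sum_{n=-\infty}^{\infty}(m+n) :b_{m-n}c_n: + \sum_{n=-\infty}^{\infty}\left(\frac{1}{2}m + n\right) :\beta_{m-n}\gamma_n:$$
$$G_m^{\text{ghost}} = -2\sum_{n=-\infty}^{\infty} b_{-n}\gamma_{m+n} + \sum_{n=-\infty}^{\infty}\left(\frac{1}{2}n - m\right)c_{-n}\beta_{m+n} \qquad (21.92)$$

Finally, Q can be written as:

$$Q = \sum_{n=-\infty}^{\infty}\left(L_{-n}c_n + G_{-n}\gamma_n\right) - \frac{1}{2}\sum_{m,n=-\infty}^{\infty}(m-n) :c_{-m}c_{-n}b_{m+n}:$$
$$\quad + \sum_{m,n=-\infty}^{\infty}\left(\frac{3}{2}n + m\right) :c_{-n}\beta_{-m}\gamma_{m+n}: - \sum_{m,n=-\infty}^{\infty}\gamma_{-m}\gamma_{-n}b_{m+n} - a c_0 \qquad (21.93)$$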


As usual, we can check for the vanishing of $Q^2$, and we find the constraints:

$$D = 10; \qquad a = \tfrac{1}{2}\ (\text{NS}), \qquad a = 0\ (\text{R}) \qquad (21.94)$$

Although the NSR formulation is quite simple and easy to work with, one
disadvantage is that ten-dimensional space-time supersymmetry (not to be
confused with the two-dimensional superconformal symmetry of the NSR model) is
not manifest. There exists another reformulation of this model, called the
Green-Schwarz model,^{23} which introduces two genuine ten-dimensional spinor
fields $S^1$ and $S^2$ (which have $2^{10/2} = 32$ components each). In this model,
ten-dimensional space-time supersymmetry is manifest as a symmetry of an action.
(However, a detailed discussion of this model is beyond the scope of this book.)
The advantage of introducing these spinors $S^i$ is that we can construct various
superstring theories from them.
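
The component count quoted above can be reproduced with a one-line rule. The sketch below assumes the standard result that a Dirac spinor in even dimension D has $2^{D/2}$ components; the function name is illustrative only.

```python
def dirac_spinor_components(D):
    """Number of components of a Dirac spinor in even dimension D.

    Uses the standard rule 2**(D/2); chirality or reality conditions,
    which would reduce the count further, are not imposed here.
    """
    if D % 2 != 0:
        raise ValueError("this sketch only handles even D")
    return 2 ** (D // 2)


print(dirac_spinor_components(10))  # 32
print(dirac_spinor_components(6))   # 8
```

The D = 6 value is the eight-component spinor that appears later when the six compactified dimensions are analyzed.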


21.6 Types of Strings 


At this point, we may ask what are the various types of string theories one can write 
that are supersymmetric, ghost free, and anomaly free. The easiest way to catalog 
the various possibilities is through the light-cone quantization of the GS string, 
since all ghosts have been removed and the theory is globally supersymmetric in 
space-time. 

The list of totally self-consistent superstring theories consists of: 


1. Type I.
2. Type IIA.
3. Type IIB.
4. Heterotic.


(At present, the leading superstring theory is the heterotic string. When we refer to 
the superstring theory, we are therefore implicitly referring to the heterotic string.) 

It may seem surprising that there are so few self-consistent string theories, 
while there are an infinite number of point particle theories. The reason for this 
is that the Feynman diagrams of a point particle are based on one-dimensional 
graphs, upon which we can impose any number of Lorentz covariant vectors 
and spinors with arbitrary isospin indices in our Feynman rules. However,
the Feynman diagrams of string theory are two-dimensional manifolds, obeying
strict self-consistency constraints; so it is not surprising that we only find four
self-consistent string theories.


21.6.1 Type I


The first string theory is called type I, which contains both open strings and closed
strings. The two spinor fields $S^1$ and $S^2$ of the GS model have the same chirality.
(Because the closed string emerges as a bound state of open string graphs, we
must add the closed string sector to the open string in order to maintain unitarity.)

Gauge invariance can be added into the theory by multiplying the N-point
function with appropriate traces over the generators of some Lie algebra (called
Chan-Paton factors). The gauge group must be SO(32) in order to cancel all
anomalies.


21.6.2 Type IIA


For closed strings, there are two ways to choose the chiralities of $S^1$ and $S^2$. If we
choose them to be of opposite chirality, then we have the type IIA string. Type
IIA closed string theory is appealing because it has no chiral anomalies from the
very beginning (since the two chiral sectors cancel against each other). In the 
zero-slope limit, when only the massless sector of the theory survives, the theory 
reduces to the point particle N = 2, D = 10 supergravity theory. 


21.6.3 Type IIB 


For closed strings, if $S^1$ and $S^2$ have the same chirality, then we have the type
IIB superstring. However, in the zero-slope limit, when we analyze the massless
sector, we find that there does not exist any known covariant version of this theory. 
Its light-cone reduction is well defined, but its covariant precursor apparently 
cannot be written. (This may be because of our limited understanding of how to 
construct point particle supersymmetric theories in ten dimensions.) 

At present, it seems, however, that the type II string cannot describe the 
physical $SU(3) \otimes SU(2) \otimes U(1)$ symmetry of our low-energy universe. By
compactifying from ten dimensions to four dimensions, the type II string can 
introduce a wide array of symmetries, but none of them seems to fit the description 
of our world. 


21.6.4 Heterotic String 


The string theory that holds the most promise of describing the physical world
is the heterotic string.^{24} While the type I string uses multiplicative Chan-Paton
factors to introduce isospin symmetry, the heterotic string introduces isospin in an
unorthodox fashion. We recall that the closed string has two sets of operators, $a_n$
and $\tilde a_n$, which do not interact; that is, as the closed string propagates, it has
right-moving and left-moving oscillator modes. The heterotic string splits these
modes apart. The left-moving modes are purely bosonic and live in a 26-dimensional
space labeled by $X^\mu$, which has been compactified to ten dimensions, leaving us
with a compact 16-dimensional space. If we use the symbol $X^i$ ($X^I$) to represent
the 10 (16)-dimensional space, then we have:

$$X^\mu \to \{X^i, X^I\} \qquad (21.95)$$

We will choose the compactified 16-dimensional string, labeled by $X^I$, to live
on the root lattice space of an $E_8 \otimes E_8$ isospin symmetry. Since $E_8$ is a
rank-eight Lie group, the heterotic string can be compactified so that its spectrum is
$E_8 \otimes E_8$ [or Spin(32)/$Z_2$], which is certainly large enough to permit a serious
phenomenological investigation.
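
The dimension counting behind this construction can be laid out as simple bookkeeping. The sketch below is illustrative only: the ranks and dimensions of $E_8$ and SO(32) are standard Lie-algebra data assumed here rather than derived, and the variable names are not from the text.

```python
# Bookkeeping sketch for the heterotic construction (illustrative names).
LEFT_DIM = 26    # bosonic left-moving sector
RIGHT_DIM = 10   # supersymmetric right-moving sector
compact_dim = LEFT_DIM - RIGHT_DIM  # internal directions X^I

# Standard Lie-algebra data, assumed rather than derived here.
E8 = {"rank": 8, "dim": 248}
SO32 = {"rank": 16, "dim": 32 * 31 // 2}

rank_E8xE8 = 2 * E8["rank"]
dim_E8xE8 = 2 * E8["dim"]

print(compact_dim, rank_E8xE8, SO32["rank"])  # 16 16 16
print(dim_E8xE8, SO32["dim"])                 # 496 496
```

Both candidate groups have rank 16, matching the 16 compactified directions, and both have the same total dimension, 496.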

However, the right-moving modes only live in a ten-dimensional space and 
contain the supersymmetric GS or NSR theory. When the left-moving half (con- 
taining the isospin) and the right-moving half (containing the supersymmetry) are 
put together, they produce a self-consistent, ghost-free, anomaly-free, one-loop 
finite theory, the heterotic string (meaning “hybrid vigor”). 

The action for the heterotic string is therefore: 


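$$S = -\frac{1}{4\pi\alpha'}\int d\tau\int_0^{2\pi} d\sigma\left(\partial_a X^i\,\partial^a X^i + \sum_{I=1}^{16}\partial_a X^I\,\partial^a X^I + i\bar S\gamma^-(\partial_\tau + \partial_\sigma)S\right) \qquad (21.96)$$

where I = 1, 2, ..., 16 is an isospin index and where we enforce the
constraints:

$$(\partial_\tau - \partial_\sigma)X^I = 0, \qquad \gamma^+ S = \frac{1}{2}(1 + \gamma_{11})S = 0 \qquad (21.97)$$

where $\gamma^\pm = 2^{-1/2}(\gamma^0 \pm \gamma^9)$.

In the zero-slope limit, this theory yields ten-dimensional supergravity coupled
to a super Yang-Mills gauge multiplet with $E_8 \otimes E_8$ local gauge symmetry. Clearly,
we have enough symmetry to include the Standard Model and extract interesting
phenomenology.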


21.7 Higher Loops 
There are three main aspects to superstring theory: 


1. Superstring perturbation theory. 
2. Superstring compactification and phenomenology. 


3. Nonperturbative approaches and string field theory. 


We will discuss each of them separately.

From the point of view of quantum field theory, superstring perturbation theory
gives us entirely new, unexpected mechanisms by which to cancel the divergences
found in quantum gravity. We find, for example, that the higher loops are not
ultraviolet divergent at all (because the presence of the infinite tower of resonances
acts, in some sense, like a Pauli-Villars regulator). The only problem comes from
infrared divergences, which in turn can be controlled by using symmetry.

To see this, we will only sketch the calculation of the single-loop amplitude,
omitting many details. We will, as expected, obtain the Neumann function defined
over a disc with a hole (defined in terms of Jacobi $\theta$ functions). To calculate the
first loop amplitude for N external tachyons, we will simply trace over a series of
vertices and propagators, using the coherent state formula:

$$\frac{1}{\pi}\int d^2\lambda\; e^{-|\lambda|^2}\,|\lambda\rangle\langle\lambda| = 1 \qquad (21.98)$$


Using Eqs. (21.65) and (21.98), it is now a simple, although tedious, matter 
to take the trace over a string of vertices and propagators: 


Aiisess / d® p Tr [V(ki)DV (ka) --- DV (ky) D] 
~ [ee flier’ 3 5 f adi 
x (AAV (ki)zf V(ka)28 --- Vek) zk A) |a) 
- 4n \'3 ki-k, /2 
= 2, Iayi-4 —48 { oe aks /2 
=f []esvertizenr® (rp) [1 boten. wt" 
. (21.99) 
where:

$$\nu_i = (2\pi i)^{-1}\log z_1 z_2\cdots z_i$$
$$\nu_{ji} = \nu_j - \nu_i$$
$$\tau = (2\pi i)^{-1}\log w$$
$$w = z_1 z_2\cdots z_N$$
$$c_{ji} = z_{i+1}z_{i+2}\cdots z_j \qquad (21.100)$$
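and where:

$$\chi(c_{ji}, w) = \exp\left(\frac{\log^2|c_{ji}|}{2\log|w|}\right)\left|\frac{1-c_{ji}}{\sqrt{c_{ji}}}\prod_{m=1}^{\infty}\frac{(1-w^m c_{ji})(1-w^m/c_{ji})}{(1-w^m)^2}\right|$$
$$\qquad\quad = 2\pi\exp\left(-\frac{\pi(\mathrm{Im}\,\nu_{ji})^2}{\mathrm{Im}\,\tau}\right)\left|\frac{\theta_1(\nu_{ji}|\tau)}{\theta_1'(0|\tau)}\right| \qquad (21.101)$$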
This, in turn, can be written as:

$$A_N = \int_F d^2\tau\,(\mathrm{Im}\,\tau)^{-2}\,C(\tau)\,F(\tau) \qquad (21.102)$$


where: 


Clr) 


1 
4(5Im me es “(eae 


F(t) 


n Yam f | 1G wT LQ” (21.103) 
i<j 


The important point is that ultraviolet divergences are completely missing in
this amplitude, which is astonishing because it contains the one-loop contribution
from the graviton and an infinite tower of massive particles. However, the theory is
infrared divergent, which corresponds to $w \to 1$, or to the interior hole shrinking to
a point. This infrared divergence can be eliminated when we go to the superstring
theory. To see this, we will analyze the superstring single-loop amplitude. We
simply present the result:


fthestor (ctr 
i log |w| Xij 


i<j 
= i d?t(Im t)~? Fs(t) (21.104) 
F 
where: 
Fs(t) = (Im 1) sl []a Ga) (21.105) 
i<j 


This amplitude appears to diverge as $\tau \to 0$. However, there is a symmetry
that protects the amplitude from diverging. This symmetry is modular
invariance,^{25} which is a global symmetry, a subset of conformal invariance. A
modular transformation on the $\tau$ variable is generated by:

$$\tau' = \frac{a\tau + b}{c\tau + d} \qquad (21.106)$$

where a, b, c, d are integers satisfying ad - bc = 1.

The divergence in string theory is similar to the divergence found in
non-Abelian gauge theory. As in gauge theory, the path integral diverges
because of an infinite overcounting of the gauge symmetry, which is eliminated
by slicing the gauge orbit once. In string theory, the counterpart of slicing the
orbit is to take one “fundamental” region of the complex plane. We can do this by
taking the fundamental region to be:

$$\text{Fundamental region} = \begin{cases} -\frac{1}{2} \le \mathrm{Re}\,\tau \le \frac{1}{2} \\ \mathrm{Im}\,\tau > 0 \\ |\tau| \ge 1 \end{cases} \qquad (21.107)$$

Under modular transformations, the fundamental region is mapped onto all
other regions of the upper half of the complex plane. Thus, the divergence is
removed by taking one fundamental region and throwing away the rest.
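
The statement that every point of the upper half-plane is equivalent to a point of the fundamental region can be illustrated with a short reduction routine. This is a minimal sketch under stated assumptions: it uses the two standard generators of the modular group, T: $\tau \to \tau + 1$ and S: $\tau \to -1/\tau$, and the function name and step limit are illustrative rather than taken from the text.

```python
import math


def to_fundamental_region(tau, max_steps=1000):
    """Map tau (with Im tau > 0) into the region of Eq. (21.107).

    Repeatedly applies T-shifts (tau -> tau + integer) to bring Re tau
    into [-1/2, 1/2), then the inversion S (tau -> -1/tau) whenever
    |tau| < 1, until both conditions hold.
    """
    if tau.imag <= 0:
        raise ValueError("tau must lie in the upper half-plane")
    for _ in range(max_steps):
        shift = math.floor(tau.real + 0.5)
        tau = complex(tau.real - shift, tau.imag)
        if abs(tau) >= 1.0:
            return tau
        tau = -1.0 / tau  # S inversion; this increases Im tau
    raise RuntimeError("reduction did not converge")


reduced = to_fundamental_region(complex(0.3, 0.01))
print(reduced, abs(reduced))
```

A point very close to the real axis, such as 0.3 + 0.01i, is carried to a point with $|\tau| \ge 1$ after a few steps, which is why restricting the integration to one fundamental region keeps the integral away from the dangerous region near $\tau \to 0$.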

Multiloop amplitudes have also been calculated in the string formalism.^{26-29} The
integrands of the multiloop amplitudes correspond to the Neumann functions defined
over Riemann surfaces of genus g. Because of this close analogy with Riemann 
surfaces, we can see intuitively that the rather miraculous cancellation of all di- 
vergences at the first-loop level persists to all loop levels. We know, by conformal 
invariance, that we can isolate the divergence of each loop by “pinching” each 
hole separately. Thus, the same arguments we used in the single-loop cancellation 
can be used to show that the divergence of each “pinch” can be eliminated. 

Once we have eliminated the divergences associated with each hole separately, 
we still have to consider the subtle divergences associated with the multiple 
deformation of the topology of the surface, that is, when several holes collapse 
together. This is easiest to study in the light-cone gauge, where Mandelstam has 
eliminated all divergences of the superstring. 


21.8 Phenomenology 


One of the main problems in superstring research has been to find the true vac- 
uum of the theory, either perturbatively or nonperturbatively. Therefore, intense 
research over the years has been spent trying to catalog the various possible 
four-dimensional compactified strings. 

A few classes of these solutions include: 


1. Calabi-Yau manifolds,^{30} which are highly nonlinear, nontrivial manifolds
studied by mathematicians.


2. Orbifolds,^{31} which are certain manifolds that have fixed points on them
(e.g., a cone is an orbifold).


3. Free fermion/free boson solutions.^{32-34}


Unfortunately, we now know millions upon millions of possible string vacua. 
In fact, it is conjectured that the complete set of all possible string vacua is the 
totality of possible conformal field theories. 

Although there are an enormous number of possible four-dimensional string 
vacua, the surprising feature of string theory is that, with a few rather mild 
assumptions, one can come fairly close to describing the physical universe. Earlier, 
we saw that Kaluza—Klein theory was too restrictive to describe the physical 
universe. In particular, the Standard Model’s gauge group and complex fermion 
representations could not be accommodated. However, the string model, because 
it is not based on Riemannian space, does not suffer from these problems. To 
begin, let us make the following assumptions: 


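1. The string has compactified down to a four-dimensional Minkowski space
times a six-dimensional space:

$$M_{10} \to M_4 \otimes K_6 \qquad (21.108)$$

where $M_4$ is a maximally symmetric space; that is,

$$R_{\mu\nu\alpha\beta} = \frac{R}{12}\left(g_{\mu\alpha}g_{\nu\beta} - g_{\mu\beta}g_{\nu\alpha}\right) \qquad (21.109)$$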


2. N = 1 local supersymmetry has survived the compactification down to four 
dimensions. 


3. Some of the bosonic fields in ten-dimensional superstring theory can be set to 
zero. 


The second assumption, in particular, yields very stringent constraints on the
possible string vacua. The variation of a fermion $\psi_i$ transforming under N = 1
supergravity is given by:

$$\delta\psi_i = [\bar\epsilon Q, \psi_i] \sim D_i\epsilon \qquad (21.110)$$

If supersymmetry is preserved, then the vacuum is annihilated by Q, and therefore:

$$Q|0\rangle = 0 \;\to\; \langle 0|\delta\psi_i|0\rangle = 0 \qquad (21.111)$$

In the classical limit, this means that $\delta\psi_i$ itself must vanish:

$$\delta\psi_i \sim D_i\epsilon = 0 \qquad (21.112)$$

This deceptively simple statement is quite restrictive, because it means that $\epsilon$
is a covariantly constant spinor. This, in turn, is only possible on a highly
specialized set of six-dimensional manifolds. To find these manifolds, we must
study the covariant derivative $D_i$, which has the physical interpretation of being a
covariant displacement operator on the K manifold. If we travel in closed loops
in K space, then the effect of this is equivalent to taking multiple variations of the
fermion, so we arrive at:

$$\epsilon \to \epsilon + A^{ij}[D_i, D_j]\epsilon \qquad (21.113)$$

where $A^{ij}$ is the area tensor of the loop.

The statement that $\epsilon$ is a covariantly constant spinor means that it is invariant
under multiple displacements in K space, so that:

$$[D_i, D_j]\epsilon \sim R_{ijkl}\Gamma^{kl}\epsilon = 0 \qquad (21.114)$$

In other words, this means that:

$$R_{ij} = 0 \qquad (21.115)$$


that is, the manifold K is Ricci-flat. 

On the manifold K, the displacement operator $D_i$ contains a connection field,
which is an O(6) gauge field. However, we also know that O(6) is locally
isomorphic to SU(4). Normally, an O(6) spinor has eight elements. However, this
eight-component spinor can be decomposed according to SU(4) as:

$$8 = 4 \oplus \bar 4 \qquad (21.116)$$

that is, the eight-component spinor transforms as the sum of two four-spinors of
opposite chirality. We will take $\epsilon$ to have positive chirality, so it transforms as
one 4. The fact that $\epsilon$ is a covariantly constant spinor now reduces to the simple
question: What is the largest group that will leave a constant spinor invariant?
The answer is easy to see if we write the spinor as:

$$\epsilon = \begin{pmatrix}0\\ 0\\ 0\\ \epsilon_0\end{pmatrix} \qquad (21.117)$$

Clearly, SU(3) rotations, which act only on the first three rows of $\epsilon$, form the
largest group that can leave $\epsilon$ invariant.

In summary, we have shown that, with rather mild assumptions (namely,
that N = 1 supersymmetry survives the compactification), the manifold K is
both Ricci-flat and has SU(3) holonomy. We call such manifolds Calabi-Yau
manifolds.

Next, we must check that the Bianchi identities are satisfied for the theory. 
Usually, these identities are trivially satisfied. However, for our case this is no 
longer true, especially when we invoke the third condition, that certain fields 
vanish. The Bianchi identities become: 


$$\mathrm{Tr}\,R\wedge R = \frac{1}{30}\,\mathrm{Tr}\,F\wedge F \qquad (21.118)$$


This is a highly nontrivial constraint, because the Riemann curvature sits on the 
left-hand side, while the Yang-Mills field for the exceptional group sits on the 
right-hand side. 

However, there is a nontrivial solution to this constraint. This constraint 
essentially forces us to make a link between the Yang—Mills connection field and 
the connection field of Riemannian K space. Since we know that K is a
Calabi-Yau manifold with SU(3) holonomy, we can insert the connection field of K
into the connection field of $E_8 \otimes E_8$. We know that $E_8$ contains $SU(3) \otimes E_6$ as a
subgroup. By preserving the SU(3) contained within $E_8$, we achieve a breaking
of the original exceptional group symmetry, so that:

$$E_8 \otimes E_8 \to SU(3) \otimes E_6 \otimes E_8 \qquad (21.119)$$

The original fermions of $E_8 \otimes E_8$, which formed a representation 248,
transform under $SU(3) \otimes E_6$ as follows:

$$248 = (3, 27) \oplus (\bar 3, \overline{27}) \oplus (8, 1) \oplus (1, 78) \qquad (21.120)$$
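
As a quick consistency check on Eq. (21.120) (a sketch only; the list below simply transcribes the dimensions of the SU(3) and $E_6$ representations in each term, and the variable names are illustrative), the dimensions on the right-hand side should add up to 248:

```python
# Each entry is (dim of SU(3) rep, dim of E6 rep) from Eq. (21.120);
# a conjugate representation has the same dimension as the original.
branching = [(3, 27), (3, 27), (8, 1), (1, 78)]

total = sum(su3_dim * e6_dim for su3_dim, e6_dim in branching)
print(total)  # 248
```

The terms contribute 81 + 81 + 8 + 78, which matches the dimension of the adjoint representation of $E_8$.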


We can now place yet another restriction on the theory. The Bianchi identity 
cannot be satisfied with any choice of fermions in four dimensions. A careful 
analysis of the constraint shows that the fermions of the low-energy spectrum 
must belong to the 27, which is precisely the favored GUT representation for $E_6$.

Finally, one great advance of this construction over previous ones is that 
we can determine the number of generations from purely topological reasons. 
In standard GUT theory, we recall, there is no compelling reason to introduce 
three exact copies of the theory. In superstring theory, we have an additional 
constraint coming from topology. Analyzing which manifolds allow fermions
to propagate on them determines how many generations of fermions are
allowed. In particular, there are several manifolds that allow
precisely three generations of fermions on them.

In summary, with very mild assumptions, we have found a vast number of 
solutions to the string equations of motion that mimic many of the features of the 
physical universe. 


21.9 Light-Cone String Field Theory 


So far, we have developed string theory in the first quantized formalism, where we 
postulated a large number of ad hoc rules to derive the S matrix. In this language, 
we could not prove unitarity or fix the weights of the various diagrams. In this 
section, we will derive the second quantized field theory of strings,^{35} where all the
Feynman rules are derived from a single action. We will first discuss the light- 
cone string field theory, where unitarity is manifest, and then the BRST string field 
theory, where Lorentz covariance is manifest. 

The field theory of strings is based on $\Phi(X)$, which is a functional; that is, it
is a function of every point $X_\mu(\sigma)$ along the string for all possible values of $\sigma$:

$$\Phi(X_\mu) = \Phi\left(X_\mu(\sigma_1), X_\mu(\sigma_2), \ldots, X_\mu(\sigma_N)\right) \qquad (21.121)$$

where $\sigma_i$ are the points along the string and we let $N \to \infty$.

We can also decompose this string functional in any basis we wish. In the
harmonic oscillator basis, the string field has a particularly simple form:

$$\Phi(X) = \langle X|\Phi(x_0)\rangle \qquad (21.122)$$

where:

$$|\Phi(x_0)\rangle = \phi(x_0)|0\rangle + A_\mu(x_0)\,a_1^{\mu\dagger}|0\rangle + g_{\mu\nu}(x_0)\,a_1^{\mu\dagger}a_1^{\nu\dagger}|0\rangle + \cdots \qquad (21.123)$$

where $x_0$ is the usual four-vector representing ordinary space-time. Here, we see
the explicit decomposition of the field functional in terms of the tachyon field
$\phi(x_0)$, the Maxwell field $A_\mu(x_0)$, a massive graviton field $g_{\mu\nu}(x_0)$, etc.
We can repeat our discussion in Section 8.3, where we made the transition from
first to second quantization for point particles. We find that the free light-cone
action is given by:

$$S = \int d\tau\, DX_i\, dp^+\left[\Phi^\dagger_{p^+}(X_i, \tau)\,(i\partial_\tau - H)\,\Phi_{p^+}(X_i, \tau)\right] \qquad (21.124)$$

where the light-cone Hamiltonian is given in Eq. (21.42) and $\Phi_{p^+}(X_i, \tau)$ is the
Fourier transform of $\Phi(X^-, X^+, X_i)$ with respect to $X^-$ after we have taken the
light-cone gauge $X^+ = p^+\tau$.

Figure 21.5. In light-cone string field theory, there are five ways in which open and closed
strings may interact. Notice that all interactions take place locally along the string at certain
points. Closed string field theory has only a cubic interaction.

The measure $DX_i$ is equal to:

$$DX_i = \prod_{i}\prod_{n} dX_{i,n} = \prod_{i}\prod_{\sigma} dX_i(\sigma) \qquad (21.125)$$


Next, we will only sketch how to write the interaction Lagrangian for the 
light-cone string field theory. In Figure 21.5, we list the five possible interactions 
that are required to describe open string field theory. (For a purely closed string 
field theory, only the cubic term is necessary.) For open strings, an examination 
of Figure 21.5 shows that strings can join at their endpoints (or break at an interior 
point). Among other terms, the interaction Lagrangian contains a $\Phi^3$ term, with a
Dirac delta function sandwiched in between:

$$S_3 = \int\prod_{r=1}^{3} dp_r^+\, DX_r\;\delta\left(\sum_{r=1}^{3} p_r^+\right)\delta_{123}\;\Phi^\dagger(X_1)\Phi^\dagger(X_2)\Phi(X_3) + \text{h.c.} \qquad (21.126)$$

where:

$$\delta_{123} = \prod_{i}\prod_{\sigma}\delta\left(X_3^i(\sigma_3) - \theta(\pi\alpha_1 - \sigma)X_1^i(\sigma_1) - \theta(\sigma - \pi\alpha_1)X_2^i(\sigma_2)\right) \qquad (21.127)$$


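where the string variables are defined as:

$$\sigma_1 = \sigma, \qquad 0 \le \sigma \le \pi\alpha_1$$
$$\sigma_2 = \sigma - \pi\alpha_1, \qquad \pi\alpha_1 \le \sigma \le \pi(\alpha_1 + \alpha_2)$$
$$\sigma_3 = \pi(\alpha_1 + \alpha_2) - \sigma, \qquad 0 \le \sigma \le \pi(\alpha_1 + \alpha_2) \qquad (21.128)$$

with the condition $\sum_r \alpha_r = 0$. To calculate all N-point functions and loop
diagrams, it is necessary to use all five interactions shown in Figure 21.5. If we
let $\Phi$ ($\Psi$) represent open (closed) string fields, we can symbolically represent the
interactions for the open and closed strings:

$$\mathcal{L}_{\text{open}} = \Phi^3 + \Phi^4 + \Phi^2\Psi + \Phi\Psi + \Psi^3$$
$$\mathcal{L}_{\text{closed}} = \Psi^3 \qquad (21.129)$$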


In other words, the open string vertex function by itself cannot generate all string 
amplitudes, so we must necessarily include closed strings as well. Thus, even 
if we started out with an open string theory without any gravitons, we find that 
gravitons necessarily creep back into the theory. There is no choice: String theory 
is by its very nature a theory of quantum gravity. 

The purely closed string action, by contrast, is cubic. This is rather remarkable: 
The theory of quantum gravity, which is highly nonlinear, coupled to an infinite 
tower of spinning fields, is cubic. The $\Psi^3$ interaction is sufficient to generate all
the interactions of gravity coupled to matter fields. 


21.10 BRST Action 


There is also a second quantized formalism in which gauge invariance and Lorentz
invariance are manifest. Let us choose the BRST action:

$$S = \int DX\, Db\, Dc\;\Phi\, Q\, \Phi \qquad (21.130)$$

where $\Phi$ is defined to have ghost number -1/2, and Q is the usual BRST operator.

The advantage of this second approach is that one can see explicitly the gauge
invariance of the theory. The theory is invariant under:

$$\delta\Phi = Q\Lambda \qquad (21.131)$$

because Q is nilpotent.

As we mentioned, the first quantized string theory required a sum over the set 
of all conformally inequivalent topologies. This conveniently concealed many 
difficult questions concerning how to place coordinates (or moduli) on arbitrary 
Riemann surfaces. The principal problem is that, until recently, mathematicians 
have been unable to triangulate moduli space successfully for genus g Riemann 
surfaces, even after a century of experience with these surfaces. Remarkably, 
string field theory gives an exact triangulation of moduli space, thus solving a 
long-standing mathematical problem. 

Let us begin our discussion by first requiring that open string field theory be
a gauge theory that satisfies the axioms of gauge theory. Specifically, we need to
postulate the existence of a derivative Q and a product operation *. We postulate
the following five axioms:


1. The existence of a nilpotent derivative Q such that $Q^2 = 0$.


2. The associativity of the * product:

$$(A * B) * C = A * (B * C) \qquad (21.132)$$

3. The Leibniz rule:

$$Q(A * B) = QA * B + (-1)^{|A|} A * QB \qquad (21.133)$$

4. The product rule:

$$\int A * B = (-1)^{|A||B|}\int B * A \qquad (21.134)$$

5. The integration rule:

$$\int QA = 0 \qquad (21.135)$$

where $(-1)^{|A|}$ is -1 if A is Grassmann odd and +1 if A is Grassmann even.
We postulate that the field A has the following transformation rule:

$$\delta A = Q\Lambda + A * \Lambda - \Lambda * A \qquad (21.136)$$

Figure 21.6. The symmetric interaction of Witten’s covariant open string field theory.

Then we can construct a curvature form given by:

$$F = QA + A * A \qquad (21.137)$$

such that:

$$\delta F = F * \Lambda - \Lambda * F \qquad (21.138)$$

It is easy therefore to show that the following is a total derivative:

$$\int F * F = \int Q\left(A * QA + \frac{2}{3} A * A * A\right) \qquad (21.139)$$

Then the Witten action^{36} is given as a Chern-Simons form:

$$\mathcal{L} = A * QA + \frac{2}{3} A * A * A \qquad (21.140)$$


(The Chern-Simons form is preferable to the usual F? form found in ordinary 
gauge theory, because Q already has two derivatives contained within it.) 

Our task is to find a multiplication operation that satisfies the postulates of 
the * product. Then, the gauge invariance of the theory is automatic, without any 
more work. We notice, first of all, that the * operation is symmetric in all three 
strings. There is only one unique configuration that is symmetrical in all three 
fields, and that is given in Figure 21.6, where the midpoint of the strings has been 
singled out. 

The multiplication operation: 


$$|X_3\rangle = |X_1\rangle * |X_2\rangle \qquad (21.141)$$

simply means that we have exchanged the Fock spaces of strings 1 and 2 for string
3, such that the points along strings 1 and 2 have been identified with points along
string 3. The triple product (without ghosts) can be defined as a delta function:

$$\int\Phi * \Phi * \Phi = \int DX_1\, DX_2\, DX_3\;\Phi(X_1)\Phi(X_2)\Phi(X_3)\,\prod_{r=1}^{3}\prod_{\sigma_r}\delta\left(X_{r,\mu}(\sigma_r) - X_{r-1,\mu}(\pi - \sigma_{r-1})\right) \qquad (21.142)$$


(where we omit the ghost delta functions). 

Let us now write the ghost number for all the operators in the theory. The c
ghost has ghost number 1, the b ghost has ghost number -1, so that Q has ghost
number 1. This, in turn, fixes the ghost number of the A field to be -1/2, since the
action contains a term $\langle A|Q|A\rangle$, which must have total ghost number 0.

The ghost number of the gauge parameter $\Lambda$ and the * operation can be fixed
by observing the gauge variation of the A field. In order for the left-hand side
(with ghost number -1/2) to equal the ghost number of the right-hand side, the
ghost number of $\Lambda$ must be -3/2 and the ghost number of the * operation must be
+3/2.

Similarly, we can fix the ghost number of the $\int$ operation by demanding that
the action have total ghost number zero. Putting everything together, we have the
following set of ghost numbers:


c: 1          *: 3/2
b: -1         ∫: -3/2
Q: 1          Λ: -3/2
A: -1/2                      (21.143)
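
The ghost-number assignments can be checked mechanically. The sketch below is a bookkeeping illustration only: the dictionary keys and the helper name are hypothetical, and the values are the ones just derived (A: -1/2, Q: 1, gauge parameter: -3/2, star product: 3/2, integration: -3/2).

```python
from fractions import Fraction as Fr

# Ghost numbers as assigned in the text; the key names are illustrative.
gh = {
    "c": Fr(1), "b": Fr(-1), "Q": Fr(1), "A": Fr(-1, 2),
    "Lambda": Fr(-3, 2), "star": Fr(3, 2), "int": Fr(-3, 2),
}


def ghost_number(*symbols):
    """Total ghost number of a product of the listed symbols."""
    return sum((gh[s] for s in symbols), Fr(0))


# Each term of delta A = Q Lambda + A * Lambda - Lambda * A must carry
# the same ghost number as A itself.
assert ghost_number("Q", "Lambda") == gh["A"]
assert ghost_number("A", "star", "Lambda") == gh["A"]

# Both terms of the action must have total ghost number zero.
kinetic = ghost_number("int", "A", "star", "Q", "A")
cubic = ghost_number("int", "A", "star", "A", "star", "A")
print(kinetic, cubic)  # 0 0
```

Exact rational arithmetic is used so that the half-integer ghost numbers cancel without rounding error.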


What is more interesting, of course, is a covariant closed string field theory. 
Unfortunately, it is more complicated, requiring a nonpolynomial action where 
the closed string interactions have the topology of polyhedra.^{37-39} In this short
chapter we are unable to present this action or other interesting features of the 
superstring theory. We could only sketch the highlights. The interested reader is 
therefore urged to consult the literature concerning the many fascinating properties 
of superstrings that are beyond the scope of this book. 

In summary, the advantages of the superstring theory are: 


1. The theory is finite to all orders in perturbation theory. It requires no renor- 
malization. 


2. The theory necessarily includes quantum gravity and gauge theory as subsets. 
Dropping quantum gravity from the action, in fact, destroys the properties of 
the theory. 


3. The theory contains all the symmetries so far found in quantum field theory
as a subset, yet it is totally free of anomalies.


4. With only a few assumptions, one can obtain the chiral fermion spectrum
contained within the 27 of $E_6$, which includes all the known fermions.


5. The generation problem can be formally solved by analyzing the topological 
invariants of a six-dimensional manifold. 


6. The model is so tightly constrained that only a handful of self-consistent string 
theories is possible. 


However, we should also mention the formidable problems facing superstring 
theory: 


1. As with any theory of quantum gravity, the superstring theory is defined at the 
Planck energy, and hence testing the superstring theory becomes problematic, 
if not impossible. 


2. Millions of vacua for the theory have been found, some of which have three 
generations of fermions and can reproduce many of the features of the Standard 
Model. However, the outstanding problem is finding which one, if any, is the 
true vacuum of the theory. 


3. Experimentally, the theory cannot explain why the cosmological constant 
is extremely close to zero. Supersymmetry, before symmetry breaking, is 
powerful enough to fix the cosmological constant to be zero. However, once 
supersymmetry is broken, it is not known how to keep the cosmological 
constant zero. 


Of these various problems, the most fundamental is perhaps the second. Until 
the true, nonperturbative vacuum of the theory can be isolated among the millions 
that have been discovered, the theory has no real predictive power. However, 
since the superstring equations are perfectly well defined, the true nonperturbative 
vacuum solution can, in principle, be found. Thus, the main problem facing 
superstring theory at present is theoretical, to isolate the true vacuum of the theory, 
rather than experimental. Until this solution is found, our attitude is to treat the 
superstring theory as a highly sophisticated theoretical laboratory in which to test 
the limits of quantum field theory. 


21.11 Exercises 


1. Show that the Veneziano amplitude in Eq. (21.60) satisfies all the properties of 
an S matrix, except for one; that is, show that it is analytic, Lorentz invariant, 
CPT invariant, crossing symmetric, and Regge behaved: 

$$A(s,t) \to s^{\alpha(t)} \qquad (21.144)$$

for large s and fixed t. Why does the Veneziano formula not satisfy the last 
remaining constraint of the S matrix, unitarity? 


2. Using harmonic oscillators, expand out the N-point amplitude in Eq. (21.63) 
using coherent state methods and show it to be equivalent to Eq. (21.59). 


3. Express the N-point Veneziano formula so that it is manifestly invariant under 
a real projective transformation performed on the integration variables: 

$$z_i \to z_i' = \frac{a z_i + b}{c z_i + d} \qquad (21.145)$$

where ad - bc = 1. 


4. For the Nambu-Goto string, calculate the momenta $P_\mu$ and prove that they 
satisfy Eq. (21.37). 


5. Given the BRST operator Q, show by direct calculation that it is nilpotent 
only in 26 dimensions; that is, prove Eq. (21.53). 


6. Show that the condition $Q|\Phi\rangle = 0$ is sufficient to eliminate the longitudinal 
mode of the Maxwell field. 


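7. For the string field given in Eq. (21.123), show that the variation: 

$$\delta|\Phi\rangle = L_{-1}|\Lambda\rangle \qquad (21.146)$$

contains within it the gauge variation of the Maxwell field: $\delta A_\mu = \partial_\mu \Lambda$. 


8. Show that the field variation: 

$$\delta|\Phi\rangle = L_{-1}|\Lambda\rangle + L_{-2}|\Lambda\rangle \qquad (21.147)$$

yields, for the spin-2 field: 

$$\delta h_{\mu\nu} = \partial_\mu \Lambda_\nu + \partial_\nu \Lambda_\mu \qquad (21.148)$$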


9. In the commutation relation for the Virasoro algebra in Eq. (21.36), explicitly 
show where the c-number term comes from. (Hint: take the vacuum 
expectation value of the Virasoro algebra.) 


10. Consider the state: 

$$|\psi\rangle = (L_{-2} + a L_{-1}^2)\,|\phi\rangle \qquad (21.149)$$

where $|\phi\rangle$ satisfies the Virasoro constraint. Show that this state does not 
couple to real states that satisfy the Virasoro constraint; that is, show that it 
is spurious. Now demand that this state also be real, that is, that it also satisfy the 
Virasoro constraint: 

$$L_1|\psi\rangle = L_2|\psi\rangle = 0 \qquad (21.150)$$

Show that this fixes D = 26 and a = 3/2. At first, this may seem to be a 
disaster: We have constructed a real state that is also spurious. But show that 
this state has zero norm, and hence the theory still makes sense at D = 26. 


11. Calculate the four-point function for the scattering of four tachyons in the 
Neveu—Schwarz model. Show that: 


A4(s, t) (0; ky |ki - ea 2 Vhs) - b_1/2|0; ka) 


$$= \frac{\Gamma(1 - \alpha(s))\,\Gamma(1 - \alpha(t))}{\Gamma(1 - \alpha(s) - \alpha(t))} \qquad (21.151)$$


where V =k, "Vo, where a(s) = 1+ a's and ak? = 1. 


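12. Prove: 

$$\exp\left(\tfrac{1}{2}\, a_i A_{ij} a_j\right) \exp\left(\tfrac{1}{2}\, a_i^\dagger B_{ij} a_j^\dagger\right)|0\rangle = \det{}^{-1/2}(1 - AB)\, \exp\left[\tfrac{1}{2}\, a_i^\dagger \left(B\,(1 - AB)^{-1}\right)_{ij} a_j^\dagger\right]|0\rangle \qquad (21.152)$$

[Hint: use Eqs. (21.65) and (21.98) by contracting onto coherent states.] 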


13. Prove that $L_{-1}$, $L_0$, and $L_1$ generate the group SL(2, R) (the set of 2 x 2 real 
matrices with unit determinant). This is also called the projective group. 


14. The modular group, which is the symmetry of the one-loop string amplitude, 
is generated by: 

$$\tau \to \tau + 1, \qquad \tau \to -\frac{1}{\tau} \qquad (21.153)$$

Show that the group generated by these transformations is equivalent to the group 
of transformations: 

$$\tau \to \frac{a\tau + b}{c\tau + d} \qquad (21.154)$$

where a, b, c, d are integers satisfying ad - bc = 1. 


15. For the open bosonic string, prove: 

$$(L_n - L_0 - n + 1)\,V_0 = V_0\,(L_n - L_0 + 1)$$
$$(L_n - L_0 + 1)\,\frac{1}{L_0 - 1} = \frac{1}{L_0 + n - 1}\,(L_n - L_0 - n + 1) \qquad (21.155)$$

From this, prove that: 

$$(L_n - L_0 - n + 1)\,V_0 D \cdots D V_0|0\rangle = 0 \qquad (21.156)$$

Show that this means that ghost states do not couple to trees, although they 
can couple to loops. How does this compare with the way Yang-Mills ghosts 
couple to trees and loops? 


16. Prove that Eq. (21.10) is equivalent to the original Lagrangian in Eq. (21.1) 
by functionally eliminating p, and A. 


Appendix 


A.1 SU(N) 


From the work of Lie and Cartan, we have a complete classification of the various 
compact Lie groups. If we restrict ourselves to compact, real forms, then the 
complete set is given by the infinite series, labeled: 


$$A_n = SU(n+1)$$
$$B_n = SO(2n+1)$$
$$C_n = Sp(2n)$$
$$D_n = SO(2n) \qquad (A.1)$$

as well as the exceptional groups, labeled by $E_6$, $E_7$, $E_8$, $F_4$, and $G_2$. 

Of special interest to physicists is the Lie group SU(N), which is the set of 
all special, unitary N x N complex matrices. If U is a member of SU(N), then it 
satisfies: 

$$U U^\dagger = 1, \qquad \det U = 1 \qquad (A.2)$$

By counting the constraints in this equation, we know that the matrix has $N^2 - 1$ 
unknowns, or parameters. 
Any unitary matrix can be represented by the exponential: 

$$U = e^{iH} \qquad (A.3)$$

where H is Hermitian: 

$$H^\dagger = H \qquad (A.4)$$

(This can be proved by taking the Hermitian conjugate of both sides of the equation.) 

Since there are $N^2 - 1$ independent traceless Hermitian N x N matrices, we can also 
write: 

$$U = \exp\left(i \sum_{a=1}^{N^2-1} \theta^a \tau^a\right) \qquad (A.5)$$

where $\tau^a$ are independent Hermitian matrices, the generators of the group, 
satisfying: 

$$[\tau^a, \tau^b] = i f^{abc}\, \tau^c \qquad (A.6)$$


To create irreducible representations of SU(N), we first postulate the existence 
of N complex fields $\phi^i$ that transform as: 

$$\phi'^i = U^i{}_j\, \phi^j \qquad (A.7)$$

We also introduce a new field $\psi$ that transforms as $\psi^*_i \to \psi^*_j (U^\dagger)^j{}_i$. Then an 
invariant is given by: 

$$\text{Invariant:} \quad \psi^*_i \phi^i \qquad (A.8)$$

This is easily shown to be an invariant, because the transformed object contains 
$U^\dagger U$ sandwiched between the two fields. Since $U^\dagger U = 1$, we see that $\psi^*_i \phi^i$ is an 
invariant. 

In fact, we can use this as an alternative definition of the group; that is, SU(N) 
consists of all complex transformations with unit determinant that leave $\psi^*_i \phi^i$ 
invariant. That is, 

$$\psi'^*_i \phi'^i = \left(\psi^*_j (U^\dagger)^j{}_i\right)\left(U^i{}_k \phi^k\right) = \psi^*_j\, (U^\dagger U)^j{}_k\, \phi^k = \psi^*_i \phi^i \qquad (A.9)$$


Notice that, unlike the case of O(N), the placement of the indices in the 
upstairs or downstairs position is extremely important, because the location of the 
indices indicates whether the vector transforms under U or under $U^\dagger$. 

The ¢' transform according to the fundamental representation of the group. 
The name is appropriate because we can derive the higher representations by 
taking tensor products of the fundamental representation. 




Higher tensors transform exactly like the product of various fundamental 
representations: 


$$T^{i_1 i_2 \cdots i_n}_{j_1 j_2 \cdots j_m} \qquad (A.10)$$


where it is important to keep track of the upstairs and downstairs indices. 


A.2 Tensor Products 


In general, such tensors are reducible. To find the irreducible representations, we 
must take symmetric and antisymmetric combinations of the indices. 

This tedious process of taking symmetric and antisymmetric combinations 
is made simpler by noticing that there are two genuine constant tensors for the 
theory: 


$$\delta^i_j, \qquad \epsilon^{i_1 i_2 \cdots i_N} \qquad (A.11)$$

To prove that these are genuine tensors, we simply act on these tensors with U 
matrices. As in the case of the group O(N), $\delta^i_j$ can be shown to be a constant tensor 
because U is unitary. Also, $\epsilon^{i_1 i_2 \cdots i_N}$ is a genuine tensor because the determinant 
of U is equal to one. 

For example, the tensor product $A^i B^j$, composed of two vectors, is reducible. 
To create smaller subsets that transform among themselves, let us take the 
symmetric or the antisymmetric combinations of $A^i B^j$. We can write: 

$$A^i B^j = \tfrac{1}{2} A^{(i} B^{j)} + \tfrac{1}{2} A^{[i} B^{j]}$$
$$A^{(i} B^{j)} = A^i B^j + A^j B^i$$
$$A^{[i} B^{j]} = A^i B^j - A^j B^i \qquad (A.12)$$


For SO(2), let the symbol 2 represent the two elements of a vector: 

$$2 = A^i \qquad (A.13)$$

In Eq. (A.12), there is only one element in the antisymmetric combination, which 
we represent symbolically as 1, and there are three elements in the symmetric 
combination (one of which can be separated out as the trace). Then a shorthand 
notation for Eq. (A.12) is given by: 

$$2 \otimes 2 = 2 \oplus 1 \oplus 1 \qquad (A.14)$$

It is easy to show that the symmetric and antisymmetric tensors, by themselves, 
form a separate representation of O(2), thereby proving that the tensor product 
$A^i B^j$ is reducible. 

To construct irreducible representations of SO(3), it is useful to know that $\delta^{ij}$ 
and $\epsilon^{ijk}$ are covariant tensors, and that we can take reducible representations and 
extract irreducible tensors from them. 

As before, we can take the product of two vectors $A^i$ and $B^j$, each of which 
transforms as a triplet 3, and extract irreducible tensors. For example, we can 
extract the singlet $A^i B^i$ and the triplet $\epsilon^{ijk} A^j B^k$ from the product. 

In general, the product of two triplets can be reduced according to whether 
they are symmetric or antisymmetric. The symmetric combination is represented 
as a 5 plus the trace 1, while the antisymmetric combination is represented as a 3, 
so we have two equivalent ways of representing this: 

$$A^i B^j = \tfrac{1}{2} A^{(i} B^{j)} + \tfrac{1}{2} A^{[i} B^{j]}$$
$$3 \otimes 3 = 5 \oplus 1 \oplus 3 \qquad (A.15)$$


Similarly, we can construct the irreducible representations of SU(3) by taking 
tensor products of the fundamental representation: 

$$3 \otimes 3 = 6 \oplus \bar{3} \qquad (A.16)$$

We can also take the combination $\psi^*_i$ times $\phi^j$, which reduces to: 

$$3 \otimes \bar{3} = 8 \oplus 1 \qquad (A.17)$$

where 1 is represented by $\psi^*_i \phi^i$. 

For SU(N), this identity can be written as: 

$$N \otimes \bar{N} = (N^2 - 1) \oplus 1 \qquad (A.18)$$


For more complicated tensor products, taking tensor products becomes rather 
tedious, so we use the method of Young tableaux. Let the box symbol represent 
$\phi^i$. If we have the product of two vectors and take the symmetric product, we 
have $\phi^{(i} \phi^{j)}$, which is represented by two horizontal boxes. 

In general, n horizontal boxes means that we have an nth-rank tensor such that 
the indices are symmetrized. 

The number of independent components within this horizontal array of boxes 
is given by: 

$$\binom{N+n-1}{n} = \frac{N(N+1)\cdots(N+n-1)}{n!} \qquad (A.19)$$

Figure A.1. In this diagram, we see that the product of a quark and an antiquark gives us 
an octet and singlet, and that two quarks give us an antitriplet and a sextet. 


When we have two boxes stacked vertically, this means that we are taking 
a second-rank tensor and then taking the antisymmetric combination of the two 
indices. In general, m boxes stacked vertically means that m indices are 
antisymmetrized. The number of independent elements in such a vertical array is given by 
N elements taken m at a time. Thus, the dimension of m boxes stacked vertically 
is given by: 

$$\binom{N}{m} = \frac{N(N-1)\cdots(N-m+1)}{m!} \qquad (A.20)$$

Figure A.2. In (a), a series of horizontal boxes corresponds to taking the symmetrized 
tensor product of n vectors. In (b), a series of vertical boxes corresponds to taking the 
antisymmetrized tensor product. In (c), we have a mixed tensor, with symmetrization for 
horizontal boxes and antisymmetrization for vertical ones. 

For example, in Figure A.1 we see the Young tableaux for Eqs. (A.16) and (A.17). 
In Figure A.2, we see more general types of Young tableaux. 
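The two counting rules can be checked directly. The sketch below assumes only the binomial formulas for n symmetrized indices, $\binom{N+n-1}{n}$, and for m antisymmetrized indices, $\binom{N}{m}$; the function names are illustrative.

```python
from math import comb

def dim_symmetric(N, n):
    """Components of a rank-n tensor of SU(N) with all indices symmetrized."""
    return comb(N + n - 1, n)

def dim_antisymmetric(N, m):
    """Components of a rank-m tensor of SU(N) with all indices antisymmetrized."""
    return comb(N, m)

# A general second-rank tensor splits into symmetric and antisymmetric parts.
for N in range(2, 8):
    assert dim_symmetric(N, 2) + dim_antisymmetric(N, 2) == N * N

print(dim_symmetric(3, 2), dim_antisymmetric(3, 2))  # sextet and (anti)triplet of SU(3)
print(dim_antisymmetric(3, 3))                       # three vertical boxes in SU(3)
print(dim_antisymmetric(5, 4))                       # N - 1 vertical boxes for N = 5
```

For SU(3) this reproduces the 6 and 3 components of the two-index product, and a full column of N boxes always has a single component.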

Of course, this process cannot be continued indefinitely. For SU(3), for 
example, three boxes stacked vertically only have one element: 

$$1 = \epsilon_{ijk}\, T^{ijk} \qquad (A.21)$$

Therefore, for SU(N), N boxes stacked vertically correspond to a tensor with 
only one independent element, denoted by 1. 

Also, notice that N - 1 boxes stacked vertically have N elements. This state 
corresponds to $\phi^*_i$, which also has N elements. If we add one more vertical box 
to N - 1 vertical boxes, then we get a scalar. Similarly, if we contract $\phi^*_i$ with $\phi^i$, 
we also get a scalar. 

By convention, an arbitrary mixed tensor consists of a series of boxes stacked 
both vertically and horizontally. Let $f_i$ equal the number of boxes stacked 
horizontally in the ith row. We take the convention that $f_i \ge f_{i+1}$, so the rows 
do not get longer as we go down the Young tableau (see Fig. A.2). 
An arbitrary Young tableau can therefore be designated by a series of numbers 
$(f_1, f_2, \ldots, f_N)$, with each number representing the number of horizontal boxes in 
each row. 

For example, a series of n horizontal boxes is designated by (n, 0, 0, ...). A 
series of m boxes stacked vertically is given by (1, 1, ..., 1) with m entries. 
Then there is a classical theorem from group theory that the dimensionality or 
number of independent elements in the mixed tensor $(f_1, f_2, \ldots, f_N)$ is given by: 


D(fi, fa. -.-. fx) ee +h — fA)d+ f= fs)---d+A) 
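As a cross-check on such dimension counts, here is a minimal sketch that uses the standard Weyl dimension formula for SU(N), $D = \prod_{i<j} (f_i - f_j + j - i)/(j - i)$. This is an assumption on our part: it is the usual textbook form of the theorem and may be arranged differently from the expression printed above. The function name `su_n_dimension` is illustrative.

```python
from fractions import Fraction

def su_n_dimension(rows, N):
    """Dimension of the SU(N) representation with row lengths f_1 >= f_2 >= ...

    Uses the Weyl dimension formula; missing rows are padded with zeros.
    """
    f = list(rows) + [0] * (N - len(rows))
    if len(f) != N or any(f[i] < f[i + 1] for i in range(N - 1)):
        raise ValueError("need at most N non-increasing row lengths")
    dim = Fraction(1)
    for i in range(N):
        for j in range(i + 1, N):
            dim *= Fraction(f[i] - f[j] + j - i, j - i)
    return int(dim)

print(su_n_dimension([2, 1], 3))     # mixed-symmetry tableau of SU(3)
print(su_n_dimension([2], 3))        # two horizontal boxes
print(su_n_dimension([1, 1], 3))     # two vertical boxes
print(su_n_dimension([1, 1, 1], 3))  # full column
```

For SU(3) these four tableaux give the octet, sextet, antitriplet, and singlet, in agreement with the counting rules for purely horizontal and purely vertical arrays.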
A.3 SU(3) 


We can repeat many of the same steps for SU(3) using ladder operators. Notice 
that there are two generators that commute with each other (because they are 
diagonal): 

$$T_3, \qquad Y = \frac{2}{\sqrt{3}}\, T_8 \qquad (A.23)$$


This means that we can simultaneously diagonalize both operators, and that the 
eigenstates of these operators are indexed by two numbers, the ordinary isospin 
and the hypercharge. 

We therefore have states labeled by their eigenvalues: 

$$T_3|t_3, y\rangle = t_3|t_3, y\rangle$$
$$Y|t_3, y\rangle = y|t_3, y\rangle \qquad (A.24)$$


As with SU(2), we will now introduce the ladder operators of SU(3): 

$$T_\pm = T_1 \pm iT_2, \qquad U_\pm = T_6 \pm iT_7, \qquad V_\pm = T_4 \pm iT_5 \qquad (A.25)$$


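The new commutation relations now become: 

$$
\begin{array}{ll}
[T_3, T_\pm] = \pm T_\pm & [Y, T_\pm] = 0 \\
[T_3, U_\pm] = \mp U_\pm/2 & [Y, U_\pm] = \pm U_\pm \\
[T_3, V_\pm] = \pm V_\pm/2 & [Y, V_\pm] = \pm V_\pm \\
[T_+, V_-] = -U_- & [T_+, U_+] = V_+ \\
[U_+, V_-] = T_- & [T_3, Y] = 0 \\
[T_+, T_-] = 2T_3 & [U_+, U_-] = (3/2)Y - T_3 \\
[V_+, V_-] = (3/2)Y + T_3 & [T_+, V_+] = 0 \\
[T_+, U_-] = 0 & [U_+, V_+] = 0
\end{array}
\qquad (A.26)
$$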


By examining the commutators carefully, we see that $T_+$ raises the eigenvalue 
$t_3$ by one unit, and $T_-$ lowers it by one unit. Since $T_\pm$ commute with Y, they 
leave y the same. We also see that $U_+$ lowers $t_3$ by one-half unit, and raises y by 
one unit. Likewise, $V_+$ raises $t_3$ by one-half unit and raises y by one unit. 
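These statements can be verified in the defining three-dimensional representation. The sketch below is an illustration under stated assumptions: we take $T_3 = \mathrm{diag}(1/2, -1/2, 0)$, $Y = \mathrm{diag}(1/3, 1/3, -2/3)$, and represent each ladder operator by a matrix with a single unit entry. This normalization is our choice, made so that $[T_+, T_-] = 2T_3$, and the helper names are hypothetical.

```python
from fractions import Fraction as F

def unit(i, j):
    """3x3 matrix with a single 1 in row i, column j (0-based)."""
    return [[F(1) if (r, c) == (i, j) else F(0) for c in range(3)] for r in range(3)]

def diagonal(*entries):
    return [[entries[r] if r == c else F(0) for c in range(3)] for r in range(3)]

def mul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(3)) for c in range(3)] for r in range(3)]

def combine(a, A, b, B):
    """Linear combination a*A + b*B."""
    return [[a * A[r][c] + b * B[r][c] for c in range(3)] for r in range(3)]

def comm(A, B):
    return combine(1, mul(A, B), -1, mul(B, A))

def scale(a, A):
    return combine(a, A, 0, A)

# Assumed defining representation of the ladder operators.
T_p, T_m = unit(0, 1), unit(1, 0)
U_p, U_m = unit(1, 2), unit(2, 1)
V_p, V_m = unit(0, 2), unit(2, 0)
T3 = diagonal(F(1, 2), F(-1, 2), F(0))
Y = diagonal(F(1, 3), F(1, 3), F(-2, 3))
ZERO = scale(0, T3)

relations = {
    "[T3,T+] = T+": comm(T3, T_p) == T_p,
    "[Y,T+] = 0": comm(Y, T_p) == ZERO,
    "[T3,U+] = -U+/2": comm(T3, U_p) == scale(F(-1, 2), U_p),
    "[Y,U+] = U+": comm(Y, U_p) == U_p,
    "[T3,V+] = V+/2": comm(T3, V_p) == scale(F(1, 2), V_p),
    "[Y,V+] = V+": comm(Y, V_p) == V_p,
    "[T+,T-] = 2 T3": comm(T_p, T_m) == scale(2, T3),
    "[U+,U-] = (3/2)Y - T3": comm(U_p, U_m) == combine(F(3, 2), Y, -1, T3),
    "[V+,V-] = (3/2)Y + T3": comm(V_p, V_m) == combine(F(3, 2), Y, 1, T3),
    "[T+,U+] = V+": comm(T_p, U_p) == V_p,
    "[T+,V-] = -U-": comm(T_p, V_m) == scale(-1, U_m),
    "[U+,V-] = T-": comm(U_p, V_m) == T_m,
    "[T+,V+] = 0": comm(T_p, V_p) == ZERO,
    "[T+,U-] = 0": comm(T_p, U_m) == ZERO,
    "[U+,V+] = 0": comm(U_p, V_p) == ZERO,
}
for name, holds in relations.items():
    print(name, holds)
```

Each commutator with $T_3$ or Y reads off the shift in $(t_3, y)$ produced by the corresponding ladder operator, which is the content of the paragraph above.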

Graphically, we can represent this in eigenvalue space by plotting $t_3$ horizontally 
and y vertically. Then the action of the ladder operators is to raise or lower 
the various eigenvalues along the horizontal, vertical, and diagonal lines, as in 
Figure A.3. 

Figure A.3. The ladder operators $T_\pm$, $U_\pm$, and $V_\pm$ change the eigenvalues of a state in the 
direction of the arrows shown in this chart. 

By hitting an eigenstate $|t_3, y\rangle$ with these operators $U_\pm$, $V_\pm$, and $T_\pm$, this 
eigenstate is converted into an eigenstate that lies one step removed from the 
original state, according to the prescription given. 


In general, all the Lie groups can be analyzed in this fashion via the ladder 
operators. Given the generators of an algebra, we can divide them into two types 
of generators: 


1. The Cartan subalgebra, consisting of the generators $H_i$, which all mutually 
commute among themselves: 

$$[H_i, H_j] = 0 \qquad (A.27)$$

The number of such generators in the Cartan subalgebra is called the rank 
r of the group. [For example, SU(3) is a rank-two group because its Cartan 
subalgebra consists of $T_3$ and Y.] 

Then we can simultaneously diagonalize the members of the Cartan subalgebra, 
so that an eigenvector of these operators is given by: 

$$H_i|\lambda_1, \lambda_2, \ldots, \lambda_r\rangle = \lambda_i|\lambda_1, \lambda_2, \ldots, \lambda_r\rangle \qquad (A.28)$$


2. The ladder operators, which move the various eigenvalues of the eigenvector 
by various amounts: 

$$E_\alpha|\lambda_1, \lambda_2, \ldots, \lambda_r\rangle \propto |\lambda_1 + \alpha_1, \lambda_2 + \alpha_2, \ldots, \lambda_r + \alpha_r\rangle \qquad (A.29)$$

Since each ladder operator changes the eigenvalue of the state, we can label 
each ladder operator by a vector $\alpha$ in the space of eigenvalues, which is called 
root space. 

Then, by taking successive products of the various ladder operators acting on 
a state of highest weight, we can fill out any representation of the group. 


In this way, we can systematically exhaust all possible representations of all 
possible Lie groups. 


A.4 Lorentz Group 


Because O(4) = SU(2) $\otimes$ SU(2), we can also categorize the irreducible 
representations of the Lorentz group using two-component, complex spinors belonging to 
SU(2). 

We can decompose the four-spinor as: 


_{ Ur 
y= ( Wy ) (A.30) 


We can then take the two-spinors as: 


(1/2, 0) 


b= 
> 


(OD 1/2) y (A.31) 


We can then construct higher spin fields by taking tensor products between 
the spinors. For example, vectors can be constructed by taking the product of two 
spinors: 


$$\text{Vector:} \quad (1/2, 0) \otimes (0, 1/2) = (1/2, 1/2) \qquad (A.32)$$


A spin 3/2 field can be represented in several ways, but the most common is 
to take the product of a vector and a spinor: 


$$(1/2, 1/2) \otimes (1/2, 0) = (1, 1/2) \oplus (0, 1/2) \qquad (A.33)$$


In more familiar language, this corresponds to constructing a four-spinor with a 
vector index attached: 


/ 
Spin 3/2: Wy= ( a (A.34) 
Lb 
Then the (0, 1/2) spinor corresponds to contracting the spin 3/2 field with a 
gamma matrix: 

$$(0, 1/2): \quad \gamma^\mu \psi_\mu \qquad (A.35)$$

The (1, 1/2) field then corresponds to a spin 3/2 field that has zero contraction 
on a gamma matrix: 

$$(1, 1/2): \quad \psi_\mu - \tfrac{1}{4}\gamma_\mu \gamma^\nu \psi_\nu \qquad (A.36)$$

Similarly, a spin-2 field can be represented as the product of two vectors: 

$$\text{Spin 2:} \quad (1/2, 1/2) \otimes (1/2, 1/2) = [(0, 0) \oplus (1, 1)]_S \oplus [(0, 1) \oplus (1, 0)]_A \qquad (A.37)$$

where S (A) represents a symmetric (antisymmetric) combination. 
In more familiar language, this means that we can take the symmetric or 
antisymmetric combination of a second-rank tensor: 

$$g_{\mu\nu} = \tfrac{1}{2}\, g_{(\mu\nu)} + \tfrac{1}{2}\, g_{[\mu\nu]} \qquad (A.38)$$

where the parentheses (brackets) represent taking the symmetric (antisymmetric) 
combinations. 
Then we can extract out the trace part of the symmetric tensor: 


i v 
Suv — qoev8y 


Bhi (A.39) 


Il 


di, 1) 
(0, 0) 


Thus, (1, 1) corresponds to a traceless, symmetric second-rank tensor, which we 
adopt as our definition of the spin-2 field. 

We can go to higher and higher representations, but there is eventually a 
problem: A theory of interacting massless spin 3 particles does not seem to be 
consistent. 

Finally, we remark that it is customary to decompose the Lorentz group into 
various pieces, depending on the sign of certain parameters. 

For example, we saw that we could take $\det\Lambda = \pm 1$. If we only take 
$\det\Lambda = 1$, we have the proper Lorentz transformations, forming the subgroup 
SO(3, 1); that is, the Lie group is special; the determinant is equal to 1. 
Transformations with $\det\Lambda = -1$ are called improper. 

From the definition of the metric, we know that: 

$$g_{\mu\nu} = \Lambda^\rho{}_\mu\, g_{\rho\sigma}\, \Lambda^\sigma{}_\nu \qquad (A.40)$$

Taking the 0-0 component of this equation, we arrive at: 

$$1 = (\Lambda^0{}_0)^2 - (\Lambda^i{}_0)^2 \qquad (A.41)$$

so that: 

$$(\Lambda^0{}_0)^2 \ge 1 \qquad (A.42)$$

We thus have the orthochronous Lorentz group, with $\Lambda^0{}_0 \ge 1$, or the 
nonorthochronous Lorentz transformations, with $\Lambda^0{}_0 \le -1$. 


Thus, there are four ways in which we can decompose the Lorentz group, 
depending on the sign of $\det\Lambda$ and $\Lambda^0{}_0$: 

$$
\begin{array}{llll}
\text{Proper orthochronous:} & \det\Lambda = +1 & \Lambda^0{}_0 \ge 1 & 1 \\
\text{Improper orthochronous:} & \det\Lambda = -1 & \Lambda^0{}_0 \ge 1 & P \\
\text{Improper nonorthochronous:} & \det\Lambda = -1 & \Lambda^0{}_0 \le -1 & T \\
\text{Proper nonorthochronous:} & \det\Lambda = +1 & \Lambda^0{}_0 \le -1 & PT
\end{array}
\qquad (A.43)
$$

For example, ordinary rotations and boosts (which can be smoothly deformed 
back to the identity) are part of the proper orthochronous Lorentz group. 

A parity transformation $x^i \to -x^i$ belongs to the improper orthochronous 
Lorentz group. Time inversion $t \to -t$ belongs to the improper nonorthochronous 
Lorentz group. Full inversion $x^\mu \to -x^\mu$, which is the product of a parity and 
a time inversion, belongs to the proper nonorthochronous Lorentz group. 
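The four-way classification depends only on the sign of the determinant and the sign of the time-time component, so it can be sketched in a few lines. The code below is an illustration: the function names are ours, the matrix is given as nested lists with index 0 for time, and the function assumes its input already is a Lorentz matrix (it does not test the defining relation with the metric).

```python
import math

def det(m):
    """Determinant by Laplace expansion along the first row (adequate for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for col, entry in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * det(minor)
    return total

def classify(L, tol=1e-9):
    """Return the component of the Lorentz group containing the matrix L."""
    d = det(L)
    if abs(abs(d) - 1) > tol:
        raise ValueError("determinant must be +1 or -1")
    if L[0][0] >= 1 - tol:
        time_label = "orthochronous"
    elif L[0][0] <= -1 + tol:
        time_label = "nonorthochronous"
    else:
        raise ValueError("time-time component must have magnitude >= 1")
    return ("proper " if d > 0 else "improper ") + time_label

def diag4(a, b, c, d):
    return [[a, 0, 0, 0], [0, b, 0, 0], [0, 0, c, 0], [0, 0, 0, d]]

ch, sh = math.cosh(0.5), math.sinh(0.5)
boost = [[ch, sh, 0, 0], [sh, ch, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
parity = diag4(1, -1, -1, -1)
time_inversion = diag4(-1, 1, 1, 1)
full_inversion = diag4(-1, -1, -1, -1)

for name, L in [("boost", boost), ("parity", parity),
                ("time inversion", time_inversion), ("full inversion", full_inversion)]:
    print(name, "->", classify(L))
```

The boost lands in the proper orthochronous component, while parity, time inversion, and full inversion land in the other three, as described above.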


A.5 Dirac Matrices 


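Independent of any representation, the Dirac matrices obey a number of identities 
that follow from the definitions: 

(A.44) 


They also obey the following trace identities: 

$$\mathrm{Tr}(\gamma^{\mu_1}\gamma^{\mu_2}\cdots\gamma^{\mu_n}) = 0, \quad n \ \text{odd}$$
$$\mathrm{Tr}(\gamma^\mu\gamma^\nu) = 4g^{\mu\nu}$$
$$\mathrm{Tr}(\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma) = 4\left(g^{\mu\nu}g^{\rho\sigma} - g^{\mu\rho}g^{\nu\sigma} + g^{\mu\sigma}g^{\nu\rho}\right)$$
$$\mathrm{Tr}(\gamma_5\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma) = -4i\,\epsilon^{\mu\nu\rho\sigma} \quad (\text{for } \gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3,\ \epsilon^{0123} = +1)$$
$$\mathrm{Tr}(\not a_1 \not a_2 \cdots \not a_{2n}) = a_1\cdot a_2\, \mathrm{Tr}(\not a_3 \cdots \not a_{2n}) - a_1\cdot a_3\, \mathrm{Tr}(\not a_2 \not a_4 \cdots \not a_{2n}) + \cdots + a_1\cdot a_{2n}\, \mathrm{Tr}(\not a_2 \cdots \not a_{2n-1}) \qquad (A.45)$$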
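Although the trace identities are representation independent, it is reassuring to confirm them in an explicit representation. The sketch below assumes the Dirac representation, with $\gamma^0 = \mathrm{diag}(1, 1, -1, -1)$ and $\gamma^i$ built from the Pauli matrices in the off-diagonal blocks, the metric $g = \mathrm{diag}(1, -1, -1, -1)$, and the convention $\gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3$. All names are illustrative.

```python
from itertools import product

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def blocks(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def scaled(x, m):
    return [[x * e for e in row] for row in m]

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
IDENTITY = blocks(I2, Z2, Z2, I2)

# Assumed Dirac representation of the gamma matrices.
GAMMA = [blocks(I2, Z2, Z2, scaled(-1, I2))]
GAMMA += [blocks(Z2, s, scaled(-1, s), Z2) for s in SIGMA]
GAMMA5 = scaled(1j, mat_mul(mat_mul(GAMMA[0], GAMMA[1]), mat_mul(GAMMA[2], GAMMA[3])))
METRIC = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def trace_of_product(*mats):
    result = IDENTITY
    for m in mats:
        result = mat_mul(result, m)
    return sum(result[i][i] for i in range(4))

def check_traces():
    """Check the two-, three-, and four-gamma trace identities for all indices."""
    g = METRIC
    for mu, nu in product(range(4), repeat=2):
        assert trace_of_product(GAMMA[mu], GAMMA[nu]) == 4 * g[mu][nu]
    for mu, nu, rho in product(range(4), repeat=3):
        assert trace_of_product(GAMMA[mu], GAMMA[nu], GAMMA[rho]) == 0
    for mu, nu, rho, sig in product(range(4), repeat=4):
        expected = 4 * (g[mu][nu] * g[rho][sig] - g[mu][rho] * g[nu][sig]
                        + g[mu][sig] * g[nu][rho])
        assert trace_of_product(GAMMA[mu], GAMMA[nu], GAMMA[rho], GAMMA[sig]) == expected
    return True

gamma5_trace = trace_of_product(GAMMA5, GAMMA[0], GAMMA[1], GAMMA[2], GAMMA[3])
print(check_traces(), gamma5_trace)
```

The last quantity is the 0123 component of the $\gamma_5$ trace; its sign depends on the conventions for $\gamma_5$ and for the antisymmetric tensor, which is why both are stated explicitly here.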


Under Hermitian conjugation and charge conjugation, the Dirac matrices obey: 

$$C\gamma_5 C^{-1} = \gamma_5^T$$
$$C\sigma_{\mu\nu} C^{-1} = -\sigma_{\mu\nu}^T$$
$$C\gamma_5\gamma_\mu C^{-1} = (\gamma_5\gamma_\mu)^T \qquad (A.46)$$


Let us now specialize to specific representations. The most common is the 
Dirac representation, which has four complex components: 

Under the Lorentz group, the Dirac representation is reducible. Each Dirac 
representation can be split up into two smaller representations. We can take 
the chiral projection, which gives us the Weyl representation for left-handed and 
right-handed spinors: 


—io* 0 
= A.48) 
We can also take a purely imaginary representation of the spinors, given by 
the Majorana representation: 


& Oe 5 a 0 
a! 0 
0 =i) 
We define conjugate spinors by: 

$$\bar\psi = \psi^\dagger\gamma^0$$
$$\bar u = u^\dagger\gamma^0$$
$$\bar v = v^\dagger\gamma^0 \qquad (A.50)$$


On-shell, the spinors u and v represent the electron and positron wave functions. 
They obey: 

$$(\not p - m)\, u(p) = 0$$
$$(\not p + m)\, v(p) = 0$$
$$\bar u(p)\, (\not p - m) = 0$$
$$\bar v(p)\, (\not p + m) = 0 \qquad (A.51)$$


The spinors u and v also obey a number of normalization and completeness 
relations. They are normalized as follows: 

$$\bar u(p, s)\, u(p, s) = 1$$
$$\bar v(p, s)\, v(p, s) = -1 \qquad (A.52)$$

These spinors obey certain completeness relations: 

$$\sum_s \left[u_\alpha(p, s)\, \bar u_\beta(p, s) - v_\alpha(p, s)\, \bar v_\beta(p, s)\right] = \delta_{\alpha\beta}$$

If we sum over the helicity s, we have two projection operators: 

$$[\Lambda_+(p)]_{\alpha\beta} = \sum_s u_\alpha(p, s)\, \bar u_\beta(p, s) = \left(\frac{\not p + m}{2m}\right)_{\alpha\beta}$$
$$[\Lambda_-(p)]_{\alpha\beta} = -\sum_s v_\alpha(p, s)\, \bar v_\beta(p, s) = \left(\frac{-\not p + m}{2m}\right)_{\alpha\beta}$$

These are projection operators, and hence they satisfy: 

$$\Lambda_\pm^2 = \Lambda_\pm, \qquad \Lambda_+\Lambda_- = 0, \qquad \Lambda_+ + \Lambda_- = 1$$


A.6 Infrared Divergences to All Orders 


Although we have proved that infrared divergences can be eliminated to lowest 
order by adding the bremsstrahlung diagram to the vertex correction diagram, we 
would now like to generalize our result to all orders in perturbation theory. At 
first, this may seem like an impossible task, since there are an infinite number of 
ways in which the infrared divergence enters into various Feynman diagrams. 
However, the problem is actually tractable for two reasons. First, only a 
small subset of all possible Feynman diagrams actually contributes to the infrared 
divergence. We will therefore only concentrate on those diagrams where the 
emitted real photon is attached to the initial or final electron leg, which contribute 
to the infrared divergence when the electron is on the mass shell. If the photon has a 
small momentum $q_i$ and is emitted from an on-shell electron with momentum p, 
then the Feynman propagator contains a factor: 

$$\frac{1}{(p + q_i)^2 - m^2 + i\epsilon} \approx \frac{1}{2p\cdot q_i + i\epsilon} \qquad (A.59)$$

For small $q_i$, we see that we have an infrared problem. (Photons attached to 
internal electron lines, or electron lines which are far off the mass shell, will not 
contribute.) 

Second, there are remarkable identities that make it possible to show that all 
these divergences cancel exactly. The calculation to all orders is not difficult 
once we realize that it is possible to sum the contribution of the real and virtual 
photons into an exponential function. Let the emission of each real photon 
contribute a factor R, while the integration over each virtual photon contributes V. 
Then the contribution of summing over arbitrary numbers of real and virtual 
photons, we will show, conveniently sums up to an exponential, given by: 

$$\frac{d\sigma}{d\Omega} = \left(\frac{d\sigma}{d\Omega}\right)_0 \exp(R + 2V) \qquad (A.60)$$

Before, we found that the integration over the real photon contribution is taken 
from $\mu$ to some detector sensitivity energy E, and hence yields a factor of 
$\log(E^2/\mu^2)$. The integration over the virtual photon contribution is given by 
an integration over the four-momenta, which yields $\log(-q^2/\mu^2)$. Since we are 
taking the exponentials of these two divergent factors, the $\log\mu$ cancels perfectly, 
and the final result is convergent. 

To begin the process of summing over photon lines, let us analyze a process 
where we have an electron coming in with momentum p and scattering off with 
momentum p'. If there were no infrared divergence problems to worry about, the 
contribution of this diagram would be of the form $\bar u(p')\, O\, u(p)$. However, because 
of the infrared corrections, we must calculate the contribution due to the emission 
of real photons and the integration over virtual photons. 

To perform the calculation, it will be convenient to insert a large number of 
photons radiating from the electron line with momenta $q_i$, as in Figure A.4. 

Our job will be to calculate how to attach these various photon lines in various 
ways in order to perform the summation over real and virtual photons. 

To calculate V, the contribution of the virtual photons, we will pair off these 
photon lines, in arbitrary order, and then perform the integration over the virtual 
photon’s momenta. Then we must perform the summation over all possible 
pairings. To calculate R, by contrast, the contribution from the emission of a real 
photon, we will sum over the photon polarization and integrate over the photon 
momenta. 


Figure A.4. N-point graph for the emission of soft photons, which has an infrared 
divergence. 


If we examine, using Feynman’s rules, the sequence of propagators in the 
figure (with photon legs attached near the top of the diagram), we find: 

This Feynman diagram is not as hopeless as it may seem, especially when we 
assume that each $q_i$ is small. First, consider the photon line with momentum $q_i$ that 
is near the emitted electron, with momentum p'. We can significantly simplify 
the numerator by shoving all factors of $\not p'$ to the left, where they hit $\bar u(p')$, and 
then we can use the Dirac equation. Since all $\not p'$ can be reduced to m, the only 
possible tensor left over is $p'^\mu$; so the numerator simply becomes the product of 
the $p'^{\mu_i}$. Similarly, the denominators can be simplified. 

For small $q_i$, the product of Feynman propagators becomes: 


(A.62) 

The next task is to sum over all permutations of the $q_i$ appearing in the 
product. Although this may seem difficult, we can use a formula that simplifies 
this calculation enormously: 

$$\sum_{\text{perm}} \frac{1}{p'\cdot q_1}\, \frac{1}{p'\cdot(q_1 + q_2)} \cdots \frac{1}{p'\cdot(q_1 + q_2 + \cdots + q_n)} = \frac{1}{p'\cdot q_1}\, \frac{1}{p'\cdot q_2} \cdots \frac{1}{p'\cdot q_n} \qquad (A.63)$$

where we sum over all permutations of the $q_i$ appearing on the left-hand side of 
the equation. (The proof of this formula can be carried out by induction.) 
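The permutation identity is the statement that, for any positive numbers $a_i$ (standing in for the scalar products $p'\cdot q_i$), the sum over all orderings of $1/[a_1 (a_1 + a_2) \cdots (a_1 + \cdots + a_n)]$ equals $1/(a_1 a_2 \cdots a_n)$. A minimal sketch with exact rational arithmetic follows; the function names and sample values are ours.

```python
from fractions import Fraction
from itertools import permutations
from math import prod

def ordered_sum(a):
    """Sum over all orderings of 1 / [a1 (a1 + a2) ... (a1 + ... + an)]."""
    total = Fraction(0)
    for perm in permutations(a):
        partial = Fraction(0)
        term = Fraction(1)
        for x in perm:
            partial += x       # running sum a1 + ... + ak
            term /= partial    # one propagator-like denominator per photon
        total += term
    return total

def factorized(a):
    """Product of the individual factors 1 / a_i."""
    return Fraction(1) / prod(a)

sample = [Fraction(2), Fraction(3), Fraction(5), Fraction(7)]
print(ordered_sum(sample), factorized(sample))
```

The induction step mentioned above is visible in the code: fixing which $a_k$ comes last pulls out a common factor $1/(a_1 + \cdots + a_n)$, and summing the remaining products over k restores that same sum in the numerator.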

In summary, we are now left with a very simple expression, with each photon 
contributing a factor of $p'^{\mu_i}/p'\cdot q_i$ to the product. Not surprisingly, the same 
process can be carried out for photon legs attached to the bottom half of the 
diagram. Then we shove all $\not p$ to the right and use the Dirac equation. Then the 
Feynman diagram reduces to a product of $p^{\mu_i}/p\cdot q_i$. 

Now let us sum up the contribution of all such diagrams, in any possible 
order. The photon with momentum $q_i$ can be attached to the initial or the final 
electron. Thus, the photon with momentum $q_i$ contributes two terms, depending 
on whether it is attached to the initial or final electron leg. Since this photon can be 
attached to either leg, the correction factor for all the various photons is simply: 

$$e^n\, \bar u(p')\, O\, u(p) \prod_{i=1}^{n} Q^{\mu_i}(q_i) \qquad (A.64)$$

where we define: 

$$Q^\mu(q_i) = \left(\frac{p'^\mu}{p'\cdot q_i} - \frac{p^\mu}{p\cdot q_i}\right) \qquad (A.65)$$


At this point, we must now begin the contraction process on the various photon 
lines. Let us say that there is a total of 2N + M photon lines. We will contract 
and integrate over N pairs of photon lines in order to calculate the contribution 
of virtual photons. The remaining M photons will be emitted as real photons, 
contributing to bremsstrahlung. 

To calculate the virtual photon contribution, we must pair off the photon 
legs, insert a photon propagator for each of the N pairs, and then perform the 
integration over d*q. Each contraction of a pair of virtual photon legs contributes 
the following factor: 


ie Sf ag 


= | app 2G) Om) (A.66) 


But we must also sum over N such contractions (and divide by a factor of N!, 
which represents the number of ways that we can permute these lines). This gives 
us the following correction factor for virtual photons: 

$$\sum_{N=0}^{\infty} \frac{V^N}{N!} = \exp V \qquad (A.67)$$

Similarly, we must now calculate the contribution of M emitted real photons 
to the scattering cross section. This means inserting the photon polarizations, 
summing over these polarizations, squaring the matrix element, and integrating 
over the photon’s phase space. Fortunately, the summation over the polarizations 
just gives us a delta function, so the contribution of each emitted photon gives us 
a scalar product between Q(q) and Q(-q), as before. The value of R is therefore: 
QO(q): O(-4) (A.68) 


By the same logic as before, we must sum over M of these factors, emitted 
photons, giving us R™, and then divide by M!. As before, this gives us a factor 
of exp R. 

The net effect of summing over all possible permutations of the 2N + M 
photon lines, which generate both the real emitted photons and the virtual photon 
loops, is therefore the product of two exponentials. We can now summarize the 
contribution of both the real and the virtual photons by the formula: 

$$\frac{d\sigma}{d\Omega} = \left(\frac{d\sigma}{d\Omega}\right)_0 \exp(2V)\, \exp(R) \qquad (A.69)$$


The last and final step then involves inserting the actual value of V and R 
into the above formula. These values were already computed for the one photon 
bremsstrahlung process and vertex corrections that were computed earlier. We 
find: 


2V 
a E2 
As expected, we find a cancellation between the two factors, yielding an expression 
that is finite and independent of $\mu$. We can now safely take the limit as $\mu$ goes to 
zero, therefore obtaining the correct result without any infrared divergences. 
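The cancellation can be illustrated numerically. The sketch below is only schematic: it assumes that R is a coefficient times $\log(E^2/\mu^2)$ and that 2V is minus the same coefficient times $\log(-q^2/\mu^2)$, so that the exponent R + 2V depends only on the ratio $E^2/(-q^2)$. The common coefficient `coeff` and the sample values of E and $-q^2$ are hypothetical placeholders, not the values computed in the text.

```python
import math

def exponent(mu, E=0.01, minus_q2=4.0, coeff=0.1):
    """Schematic R + 2V for regulator mu (coeff, E, minus_q2 are assumed inputs)."""
    R = coeff * math.log(E ** 2 / mu ** 2)             # real soft photons
    two_V = -coeff * math.log(minus_q2 / mu ** 2)      # virtual photons
    return R + two_V

for mu in (1e-3, 1e-6, 1e-9):
    print(mu, exponent(mu))

# The mu-independent limit: coeff * log(E^2 / (-q^2)).
print(0.1 * math.log(0.01 ** 2 / 4.0))
```

Each logarithm diverges separately as the regulator goes to zero, but their sum stays fixed at the last printed value up to rounding, which is the cancellation described above.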
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A.7 Dimensional Regularization 


The following formulas can be derived by taking the derivative of the formulas
presented in Chapter 7:
(A.71)
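A general formula is given as follows:

$$ \int d^n k\, \frac{k_{\mu_1} k_{\mu_2} \cdots k_{\mu_p}}{(k^2 + 2k\cdot q - m^2)^{\alpha}} = \frac{i\pi^{n/2}}{\Gamma(\alpha)\,(-q^2 - m^2)^{\alpha - n/2}}\; T_{\mu_1 \mu_2 \cdots \mu_p} \qquad \text{(A.72)} $$

where:

$$ \begin{aligned}
T_{\mu_1 \mu_2 \cdots \mu_p} = (-1)^p \Big( & q_{\mu_1} q_{\mu_2} \cdots q_{\mu_p}\, \Gamma(\alpha - n/2) \\
& + \frac{1}{2} \sum_{\rm perm} \left( g_{\mu_1 \mu_2} q_{\mu_3} \cdots q_{\mu_p} \right) (-q^2 - m^2)\, \Gamma(\alpha - 1 - n/2) \\
& + \frac{1}{4} \sum_{\rm perm} \left( g_{\mu_1 \mu_2} g_{\mu_3 \mu_4} q_{\mu_5} \cdots q_{\mu_p} \right) (-q^2 - m^2)^2\, \Gamma(\alpha - 2 - n/2) \\
& + \cdots + (2)^{-p/2} \sum_{\rm perm} \left( g_{\mu_1 \mu_2} \cdots g_{\mu_{p-1} \mu_p} \right) \\
& \quad \times (-q^2 - m^2)^{p/2}\, \Gamma(\alpha - p/2 - n/2) \Big)
\end{aligned} \qquad \text{(A.73)} $$

for p even. For p odd, the last term should be:

$$ (2)^{-[p/2]} \sum_{\rm perm} \left( g_{\mu_1 \mu_2} \cdots g_{\mu_{p-2} \mu_{p-1}} q_{\mu_p} \right) \times (-q^2 - m^2)^{[p/2]}\, \Gamma(\alpha - [p/2] - n/2) \qquad \text{(A.74)} $$

where [m] means taking the largest integer not greater than m.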


By contracting the various k_μ, we can also derive a succession of related
formulas involving k².
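As an illustration of such a contraction (a sketch only, assuming the p = 2 case of the general formula above and the n-dimensional trace of the metric, g^{μν} g_{μν} = n), contracting the two free indices gives:

```latex
\int d^n k\, \frac{k^2}{(k^2 + 2k\cdot q - m^2)^{\alpha}}
= \frac{i\pi^{n/2}}{\Gamma(\alpha)\,(-q^2 - m^2)^{\alpha - n/2}}
\left[ q^2\, \Gamma(\alpha - n/2)
+ \frac{n}{2}\, (-q^2 - m^2)\, \Gamma(\alpha - 1 - n/2) \right]
```

The first term comes from q_μ q_ν and the second from the single g_{μν} term, whose contraction supplies the factor of n.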


Notes 




Chapter 1. Why Quantum Field Theory? 


1. Dirac, P.A.M., 1927, Proc. Roy. Soc. Lon. A114, 243.
2. Tomonaga, S., 1946, Prog. Theor. Phys. 1, 27; 1948, Phys. Rev. 74, 224.
3. Schwinger, J., 1949, Phys. Rev. 75, 651; 76, 790.
4. Feynman, R. P., 1949, Phys. Rev. 76, 749, 769.
5. For a historical review and more complete references, see: Pais, A., 1986, Inward
Bound. Oxford: Oxford University Press; Crease, R. P., and Mann, C. C., 1986,
Second Creation. New York: Macmillan.
6. Yukawa, H., 1935, Proc. Phys. Math. Soc. of Japan 17, 48.
7. Goldberger, M. L., 1955, Phys. Rev. 97, 508; 99, 979; Gell-Mann, M., Goldberger,
M. L., and Thirring, W., 1954, Phys. Rev. 95, 1612; 96, 1428.
8. Chew, G. F., 1962, S-Matrix Theory of Strong Interactions. Reading: Benjamin.
9. Gell-Mann, M., 1961 (unpublished); 1962, Phys. Rev. 125, 1067; 1964, Phys. Lett.
8, 214.
10. Ne’eman, Y., 1961, Nucl. Phys. 26, 222; Gell-Mann, M., and Ne’eman, Y., 1964, The
Eightfold Way. Reading: Benjamin.
11. Zweig, G., CERN Rep. 8419/TH 412.
12. Sakata, S., 1956, Prog. Theor. Phys. 16, 686.
13. Ikeda, M., Ogawa, S., and Ohnuki, Y., 1959, Prog. Theor. Phys. 22, 715.
14. Lee, T. D., and Yang, C. N., 1956, Phys. Rev. 104, 254.
15. Wu, C. S., Ambler, E., Hayward, R. W., Hoppes, D. D., and Hudson, R. P., 1957, Phys.
Rev. 105, 1413.
16. Garwin, R. L., Lederman, L. M., Weinrich, M., 1957, Phys. Rev. 105, 1415.
17. ’t Hooft, G., 1971, Nucl. Phys. B33, 173; B35, 167.
18. Weinberg, S., 1967, Phys. Rev. Lett. 19, 1264.
19. Salam, A., 1968, Elementary Particle Theory, ed. N. Svartholm. Stockholm:
Almquist and Forlag.
20. Gross, D. J., and Wilczek, F., 1973, Phys. Rev. D8, 3497.
21. Politzer, H. D., 1973, Phys. Rev. Lett. 26, 1346.
22. ’t Hooft, G., 1972, Conference on Lagrangian Field Theory, Marseille (unpublished).
23. Wilson, K. G., 1974, Phys. Rev. D10, 2445.


Chapter 3. Spin-0 and ½ Fields

1. Gordon, W., 1926, Z. Physik 40, 117.
2. Klein, O., 1927, Z. Physik 41, 407.
3. Fock, V., 1926, Z. Physik 38, 242.
4. Schrödinger, E., 1926, Ann. der Phys. 81, 109.
5. de Donder, Th., and van den Dungen, H., 1926, Comptes Rendus 183, 22.
6. Kudar, J., 1926, Ann. der Phys. 81, 632.
7. Pauli, W., and Weisskopf, V., 1934, Helv. Phys. Acta 7, 709.
8. Dirac, P.A.M., 1928, Proc. Roy. Soc. Lon. A117, 610; 1930, A126, 360.
9. Majorana, E., 1937, Nuovo Cim. 14, 171.
10. Weyl, H., 1929, Z. Physik 56, 330.


Chapter 4. Quantum Electrodynamics

1. Darwin, C. G., 1928, Proc. Roy. Soc. Lon. A118, 654.
2. Gordon, W., 1928, Z. Physik 48, 11.
3. Gupta, S. N., 1950, Proc. Phys. Soc. Lon. A63, 681.
4. Bleuler, K., 1950, Helv. Phys. Acta 23, 567.
5. Schwinger, J., 1951, Phys. Rev. 82, 914; 91, 713.
6. Lüders, G., 1954, Kgl. Dansk. Vidensk. Selsk. Mat.-Fys. Medd. 28, 5.
7. Pauli, W., 1955, Niels Bohr and the Development of Physics. New York: McGraw-Hill.


Chapter 5. Feynman Rules and Reduction 


1. Schwinger, J., 1949, Phys. Rev. 75, 651; 76, 790.
2. Tomonaga, S., 1946, Prog. Theor. Phys. 1, 27; 1948, Phys. Rev. 74, 224.
3. Feynman, R. P., 1949, Phys. Rev. 76, 749, 769.
4. Mott, N. F., 1929, Proc. Roy. Soc. Lon. A124, 425.
5. Lehmann, H., Symanzik, K., and Zimmermann, W., 1957, Nuovo Cim. 6, 319.
6. Wick, G. C., 1950, Phys. Rev. 80, 268.
7. Furry, W. H., 1937, Phys. Rev. 81, 115.
8. Källén, G., 1952, Helv. Phys. Acta 52, 417.
9. Lehmann, 1954, Nuovo Cim. 11, 342.


Chapter 6. Scattering Processes and the S-Matrix 


1. Klein, O., and Nishina, Y., 1929, Z. Physik 52, 853.
2. Dirac, P.A.M., 1930, Proc. Cam. Phil. Soc. 26, 361.
3. Møller, C., 1932, Ann. Phys. 14, 531.
4. Bhabha, H. J., 1935, Proc. Roy. Soc. Lon. A154, 195.
5. Heitler, W., 1954, The Quantum Theory of Radiation. Oxford: Clarendon Press.
6. Schwinger, J., 1949, Phys. Rev. 75, 651; 76, 790.
7. Yennie, D. R., Frautschi, S. C., and Suura, H., 1961, Ann. Phys. 13, 379.
8. Bloch, F., and Nordsieck, A., 1937, Phys. Rev. 52, 54.
9. Pauli, W., and Villars, F., 1949, Rev. Mod. Phys. 21, 433.
10. Schwinger, J., 1948, Phys. Rev. 73, 1256.
11. Combley, F. H., 1979, Rep. Prog. Phys. 42, 1889.
12. Lamb, W. E., and Retherford, R. C., 1947, Phys. Rev. 72, 241.
13. Bethe, H. A., 1947, Phys. Rev. 72, 339.
14. Uehling, E. A., 1935, Phys. Rev. 48, 55.
15. Serber, R., 1935, Phys. Rev. 48, 49.
16. Chew, G. F., 1962, S-Matrix Theory of Strong Interactions. Reading: Benjamin. See
also: Eden, R. J., Landshoff, P. V., Olive, D. I., and Polkinghorne, J. C., 1966, The
Analytic S-Matrix. Cambridge: Cambridge University Press.
17. Gell-Mann, M., Goldberger, M. L., and Thirring, W., 1954, Phys. Rev. 95, 1612; 96,
1428.
18. Goldberger, M. L., 1955, Phys. Rev. 97, 508; 99, 979.
19. Landau, L. D., 1959, Nucl. Phys. 13, 181.
20. Nambu, Y., 1958, Nuovo Cim. 9, 610.


Chapter 7. Renormalization of QED 


1. Dyson, F. J., 1949, Phys. Rev. 75, 486, 1736.
2. Mills, R. L., and Yang, C. N., 1966, Prog. of Theor. Phys. Suppl. 37-38, 507.
3. Wu, T. T., 1962, Phys. Rev. 125, 1436.
4. Salam, A., 1951, Phys. Rev. 82, 217; 84, 426.
5. Stueckelberg, E.C.G., and Petermann, A., 1953, Helv. Phys. Acta 26, 499.
6. Gell-Mann, M., and Low, F. E., 1954, Phys. Rev. 95, 1300.
7. Bollini, C. G., and Giambiagi, J. J., 1964, Nuovo Cim. 31, 550.
8. Bollini, C. G., and Giambiagi, J. J., 1972, Phys. Lett. 40B, 566.
9. ’t Hooft, G., and Veltman, M., 1972, Nucl. Phys. B44, 189.
10. ’t Hooft, G., 1973, Nucl. Phys. B62, 444.
11. ’t Hooft, G., and Veltman, M., 1973, CERN Report 73-9, Diagrammar.
12. ’t Hooft, G., 1973, Nucl. Phys. B61, 455.
13. Cicuta, G. M., and Montaldi, E., 1972, Lett. Nuovo Cim. 4, 392.
14. Butera, P., Cicuta, G. M., and Montaldi, E., 1974, Nuovo Cim. 19A, 513.
15. Ashmore, J. F., 1972, Lett. Nuovo Cim. 4, 289.
16. Speer, E. R., 1968, J. Math. Phys. 9, 1404.
17. Speer, E. R., 1971, Comm. Math. Phys. 23, 23.
18. Ward, J. C., 1950, Phys. Rev. 78, 182.
19. Takahashi, Y., 1957, Nuovo Cim. 6, 371.
20. Mills, R. L., and Yang, C. N., 1966, Prog. Theor. Phys. Suppl. 37-38, 507.
21. Wu, T. T., 1962, Phys. Rev. 125, 1436.
22. Velo, G., and Wightman, A. S., 1976, Renormalization Theory. Dordrecht: Riedel.
23. Bjorken, J. D., and Drell, S. D., 1965, Relativistic Quantum Fields. New York:
McGraw-Hill, pp. 283-363.
24. Weinberg, S., 1960, Phys. Rev. 118, 838.


Chapter 8. Path Integrals 


1. Feynman, R. P., 1948, Rev. Mod. Phys. 20, 267.
2. Feynman, R. P., and Hibbs, A. R., 1965, Quantum Mechanics and Path Integrals.
New York: McGraw-Hill.
3. Dirac, P.A.M., 1933, Physik Z. Sov. Union 3, 64.
4. Nambu, Y., 1968, Phys. Lett. 26B, 626.


Chapter 9. Gauge Theory 


1. Klein, O., 1938, New Theories in Physics, 77, Intern. Inst. of Intellectual Co-operation,
League of Nations.
2. Yang, C. N., and Mills, R. L., 1954, Phys. Rev. 96, 191.
3. Shaw, R., 1954, The Problem of Particle Types and Other Contributions to the Theory
of Elementary Particles, Cambridge University Ph.D. thesis (unpublished).
4. Utiyama, R., 1956, Phys. Rev. 101, 1597.
5. ’t Hooft, G., 1971, Nucl. Phys. B33, 173; B35, 167.
6. Faddeev, L. D., and Popov, V. N., 1967, Phys. Lett. 25B, 29. See also: Mandelstam,
S., 1962, Ann. Phys. 19, 1.
7. Feynman, R. P., 1963, Acta Physica Polonica 24, 697.
8. Gribov, V. N., 1978, Nucl. Phys. B139, 1.


Chapter 10. The Weinberg—Salam Model 


1. Nambu, Y., 1960, Phys. Rev. Lett. 4, 380.
2. Goldstone, J., 1961, Nuovo Cim. 19, 15.
3. Goldstone, J., Salam, A., and Weinberg, S., 1962, Phys. Rev. 127, 965.
4. Higgs, P. W., 1964, Phys. Lett. 12, 132.
5. Higgs, P. W., 1966, Phys. Rev. 145, 1156.
6. Kibble, T.W.B., 1967, Phys. Rev. 155, 1554.
7. Fermi, E., 1935, Z. Physik 88, 161.
8. Sudarshan, E.C.G., and Marshak, R. E., 1958, Phys. Rev. 109, 1860.
9. Feynman, R. P., and Gell-Mann, M., 1958, Phys. Rev. 109, 193.
10. Weinberg, S., 1967, Phys. Rev. Lett. 19, 1264.
11. Salam, A., 1968, Elementary Particle Theory, ed. N. Svartholm. Stockholm:
Almquist and Forlag.
12. Langacker, P., Luo, M., and Mann, A., 1992, Rev. Mod. Phys. 64, 87.
13. Aguilar-Benitez, M. et al., 1992, Review of Particle Properties, Phys. Rev. D45, 1.
14. Renton, P., 1990, Electroweak Interactions. Cambridge: Cambridge University Press.
15. Sirlin, A., 1984, Phys. Rev. D29, 89.
16. Lee, T. D., and Yang, C. N., 1955, Phys. Rev. 98, 1501.
17. ’t Hooft, G., 1971, Nucl. Phys. B33, 173; B35, 167.
18. Coleman, S., and Weinberg, E., 1973, Phys. Rev. D7, 1888.
19. Nambu, Y., and Jona-Lasinio, G., 1961, Phys. Rev. 124, 246.


Chapter 11. The Standard Model 


1. Sakata, S., 1956, Prog. Theor. Phys. 16, 686.
2. Ikeda, M., Ogawa, S., and Ohnuki, Y., 1959, Prog. Theor. Phys. 22, 715.
3. Gell-Mann, M., 1961 (unpublished); 1962, Phys. Rev. 125, 1067.
4. Gell-Mann, M., and Ne’eman, Y., 1964, The Eightfold Way. Reading: W. Benjamin.
5. Ne’eman, Y., 1961, Nucl. Phys. 26, 222.
6. Gell-Mann, M., 1964, Phys. Lett. 8, 214.
7. Zweig, G., CERN Rep. 8419/TH 412.
8. Nishijima, K., 1955, Prog. Theor. Phys. 13, 285.
9. Gell-Mann, M., 1956, Nuovo Cim. Supp. 4, 848.
10. Okubo, S., 1962, Prog. Theor. Phys. 27, 949; 28, 24.
11. Sakita, B., 1964, Phys. Rev. 136B, 1756.
12. Gürsey, F., and Radicati, L., 1964, Phys. Rev. Lett. 13, 173.
13. Zweig, G., 1965, in Symmetries in Elementary Particle Physics, ed. A. Zichichi. New
York: Academic Press.
14. Tarjanne, P., and Teplitz, V. L., 1963, Phys. Rev. Lett. 11, 447; Krolikowski, W.,
1964, Nucl. Phys. 52, 342; Hara, Y., 1963, Phys. Rev. 134B, 701; Bjorken, B. J.,
and Glashow, S. L., 1964, Phys. Lett. 11, 255; Maki, Z., and Ohnuki, Y., 1964, Prog.
Theor. Phys. 32, 144; Amati, D., Bacry, H., Nuyts, J., and Prentki, J., 1964, Nuovo
Cim. 34, 1732; Okun, L. B., 1964, Phys. Lett. 12, 250.
15. Han, M. Y., and Nambu, Y., 1965, Phys. Rev. 139B, 1006. See also the parastatistics
formulation of Greenberg, O. W., 1964, Phys. Rev. Lett. 13, 598.

For reviews of current algebra, see Refs. 16 and 17:
16. Adler, S., and Dashen, R., 1968, Current Algebras. New York: Benjamin.
17. de Alfaro, V., Fubini, S., Furlan, G., and Rossetti, C., 1973, Currents in Hadron
Physics. Amsterdam: North-Holland.
18. Feynman, R. P., and Gell-Mann, M., 1958, Phys. Rev. 109, 193.
19. Nambu, Y., 1960, Phys. Rev. Lett. 4, 380.
20. Gell-Mann, M., and Levy, M., 1960, Nuovo Cim. 16, 705.
21. Chou, K. C., 1961, Soviet Phys., JETP, 12, 492.
22. Goldberger, M. L., and Treiman, S. B., 1958, Phys. Rev. 109, 193.
23. Adler, S. L., 1965, Phys. Rev. Lett. 14, 1051.
24. Weisberger, W. I., 1965, Phys. Rev. Lett. 14, 1047.
25. Cabibbo, N., 1963, Phys. Rev. Lett. 10, 531.
26. Glashow, S. L., Iliopoulos, J., and Maiani, L., 1970, Phys. Rev. D2, 1285.
27. Kobayashi, M., and Maskawa, K., 1973, Prog. Theor. Phys. 49, 652.
28. Christenson, J. H., Cronin, J. W., Fitch, V. L., and Turlay, R., 1964, Phys. Rev. Lett.
13, 138.


Chapter 12. Ward Identities, BRST, and Anomalies 


1. Ward, J. C., 1950, Phys. Rev. 78, 182.
2. Takahashi, Y., 1957, Nuovo Cim. 6, 371.
3. Taylor, J. C., 1971, Nucl. Phys. B33, 436.
4. Slavnov, A. A., 1972, Theor. and Math. Phys. 10, 99.
5. Becchi, C., Rouet, A., and Stora, R., 1975, Comm. Math. Phys. 52, 55.
6. Kugo, T., and Ojima, I., 1978, Phys. Lett. 73B, 459.
7. Adler, S. L., 1969, Phys. Rev. 177, 2426.
8. Bell, J. S., and Jackiw, R., 1969, Nuovo Cim. 60A, 47.
9. Bardeen, W. A., 1969, Phys. Rev. 184, 1848.
10. Fujikawa, K., 1979, Phys. Rev. Lett. 42, 1195.


Chapter 13. BPHZ Renormalization of Gauge Theories 


1. Bogoliubov, N. N., and Parasiuk, O., 1957, Acta Math. 97, 227. 
2. Hepp, K., 1966, Comm. Math. Phys. 2, 301. 
3. Zimmermann, W., 1968, Comm. Math. Phys. 11, 1; 1969, 15, 208.


Chapter 14. QCD and the Renormalization Group 


1. Bjorken, J. D., 1969, Phys. Rev. 179, 1547.
2. Feynman, R. P., 1969, Phys. Rev. Lett. 23, 1415.
3. Bjorken, J. D., and Paschos, E. A., 1969, Phys. Rev. 185, 1975.
4. Callan, C. G., and Gross, D., 1969, Phys. Rev. Lett. 22, 156.
5. Adler, S., 1966, Phys. Rev. 143, 1144.
6. Gross, D., and Llewellyn Smith, C. H., 1969, Nucl. Phys. B14, 337.
7. Wilson, K. G., 1969, Phys. Rev. 179, 1499.
8. For a complete set of references, see: Frishman, Y., 1974, Phys. Rep. 13C, 1.
9. Stueckelberg, E.C.G., and Petermann, A., 1953, Helv. Phys. Acta 26, 499.
10. Gell-Mann, M., and Low, F. E., 1954, Phys. Rev. 95, 1300.
11. Callan, C. G., 1970, Phys. Rev. D2, 1541.
12. Symanzik, K., 1970, Comm. Math. Phys. 18, 227.
13. Gross, D. J., and Wilczek, F., 1973, Phys. Rev. D8, 3497.
14. Politzer, H. D., 1973, Phys. Rev. Lett. 26, 1346.
15. ’t Hooft, G., 1972, Conference on Lagrangian Field Theory, Marseille (unpublished).
16. ’t Hooft, G., 1973, Nucl. Phys. B61, 455.
17. Blaer, A., and Young, K., 1974, Nucl. Phys. B83, 493.
18. Callan, C. G., 1976, in Methods in Field Theory, ed. R. Balian and J. Zinn-Justin.
Amsterdam: North-Holland/World Scientific.


Chapter 15. Lattice Gauge Theory 


1. Wilson, K. G., 1974, Phys. Rev. D10, 2445.
2. For more complete references, see: Kogut, J. B., 1983, Rev. Mod. Phys. 55, 775.
3. Creutz, M., 1979, Phys. Rev. Lett. 43, 553.
4. See: Rebbi, C., 1982, in Non-Perturbative Aspects of Quantum Field Theory, ed. J.
Julve and M. Ramon-Medrano. Singapore: World Scientific.
5. Kogut, J., and Susskind, L., 1975, Phys. Rev. D11, 395.


Chapter 16. Solitons, Monopoles, and Instantons 


1. Russell, J. S., 1844, Rep. 14th Meet. Brit. Assoc. Adv. Sci., 311. London: John
Murray.
2. For more references, see: Rajaraman, R., 1989, Solitons and Instantons. Amsterdam:
North-Holland.
3. Dirac, P.A.M., 1931, Proc. Roy. Soc. A133, 60.
4. ’t Hooft, 1974, Nucl. Phys. B79, 276.
5. Polyakov, A. M., 1974, JETP Lett. 20, 194.
6. Belavin, A. A., Polyakov, A. M., Schwartz, A. S., and Tyupkin, Yu. S., 1975, Phys.
Lett. 59B, 85.
7. For more references on instantons, see: Coleman, S., 1985, Aspects of Symmetry.
Cambridge: Cambridge University Press; ’t Hooft, G., 1976, Phys. Rev. Lett. 37, 8.
8. ’t Hooft, G., 1976, Phys. Rev. Lett. 37, 8.
9. Jackiw, R., and Rebbi, C., 1976, Phys. Rev. Lett. 37, 172.
10. Peccei, R. D., and Quinn, H. R., 1977, Phys. Rev. D16, 1791.
11. Dine, M., Fischler, W., Srednicki, M., 1981, Phys. Lett. 104B, 199.


Chapter 17. Phase Transitions and Critical Phenomena 


1. Baxter, R. J., 1982, Exactly Solved Models in Statistical Mechanics. San Diego: 
Academic Press. 


2. For further references, see: Domb, C., and Lebowitz, J. L., eds., 1986, Phase
Transitions and Critical Phenomena 10. San Diego: Academic Press.
3. Ising, E., 1925, Z. Physik 31, 253.
4. Onsager, L., 1944, Phys. Rev. 65, 117.
5. McCoy, B. M., and Wu, T. T., 1973, The Two Dimensional Ising Model. Cambridge:
Harvard University Press.
6. Yang, C. N., 1952, Phys. Rev. 85, 808.
7. Landau, L. D., 1937, Phys. Zurn. Sowjetunion 11, 26, 545.
8. Ginzburg, V. L., and Landau, L. D., 1950, JETP 20, 1064.
9. Kadanoff, L. P., 1965, Physics 2, 263.
10. Wilson, K. G., and Kogut, J., 1974, Phys. Rep. 12C, 76.
11. Brezin, E., Le Guillou, J. C., Zinn-Justin, J., and Nickel, B. G., 1974, Phys. Lett. 44A,
2217.


Chapter 18. Grand Unified Theories 


1. Georgi, H. M., Quinn, H. R., and Weinberg, S., 1974, Phys. Rev. Lett. 33, 451.
2. Pati, J. C., and Salam, A., 1973, Phys. Rev. Lett. 31, 275.
3. Georgi, H., and Glashow, S. L., 1974, Phys. Rev. Lett. 32, 438.
4. Gildener, E., 1976, Phys. Rev. D14, 1667.
5. Fritzsch, H., and Minkowski, P., 1975, Ann. Phys. 93, 193.
6. Georgi, H., 1975, in Particles and Fields—1974, ed. C. E. Carlson. New York: AIP
Press.
7. Gürsey, F., Ramond, P., and Sikivie, P., 1976, Phys. Lett. 60B, 177.
8. Farhi, E., and Susskind, L., 1981, Phys. Rep. 74, 277.


Chapter 19. Quantum Gravity 


1. Einstein, A., 1915, Sitzungsber. Preuss. Akad. Wiss. 778, 779, 844.
2. Hubble, E. P., 1936, Astrophys. J. 84, 270.
3. Gamow, G., 1946, Phys. Rev. 70, 572.
4. Alpher, R. A., Bethe, H., and Gamow, G., 1948, Phys. Rev. 73, 803.
5. Friedman, A., 1922, Z. Physik 10, 377.
6. Robertson, H. P., 1935, Astrophys. J. 82, 284.
7. Sakharov, A. D., 1967, Zh. Ek. Teor. Fiz. 5, 24.
8. Yoshimura, M., 1978, Phys. Rev. Lett. 41, 281.
9. Guth, A. H., 1981, Phys. Rev. D23, 347.
10. Linde, A. D., 1982, Phys. Lett. 108B, 389.
11. Linde, A. D., 1984, Rep. Prog. Phys. 47, 925.


12. Kaluza, Th., 1921, Sitz. Preuss. Akad. Wiss K1, 966. 
13. Klein, O., 1926, Z. Phys. 37, 895. 


14. DeWitt, B. S., 1963, in Dynamical Theory of Groups and Fields, 1963 Les Houches 
Summer School. 


15. For more references, see: Appelquist, T., Chodos, A., and Freund, P.G.O., 1987, 
Modern Kaluza—Klein Theories. Reading: Addison-Wesley. 


16. Witten, E., 1981, Nucl. Phys. B186, 412. 


17. Goroff, M. H., and Sagnotti, A., 1985, Phys. Rev. 160B, 81; 1986, Nucl. Phys. B266, 
709. 


18. van de Ven, A.E.M., DESY-91-115. 


Chapter 20. Supersymmetry and Supergravity 


1. Miyazawa, H., Prog. Theor. Phys. 1966, 36, 1266; 1968, Phys. Rev. 170, 1586.
2. Neveu, A., and Schwarz, J. H., 1971, Nucl. Phys. B31, 86; Ramond, P., 1971, Phys.
Rev. D3, 2415.
3. Gervais, J. L., and Sakita, B., 1971, Nucl. Phys. B34, 632.
4. Gol’fand, Yu. A., and Likhtman, E. P., 1971, Sov. Phys.: JETP Lett. 13, 323.
5. Volkov, D. V., and Akulov, V. P., 1972, JETP Lett. 16, 438.
6. Wess, J., and Zumino, B., 1974, Nucl. Phys. B70, 34.
7. Mandelstam, S., 1982, Proc. 21st. Int. Conf. on High Energy Physics, ed. P. Petiau,
and J. Pomeuf. J. Phys. 12, 331; 1983, Nucl. Phys. B213, 149.
8. Brink, L., Lindgren, O., and Nilsson, B., 1983, Phys. Lett. 123B, 323.
9. Howe, P., Stelle, K., and Townsend, P., 1983, Nucl. Phys. B212, 401.
10. Grisaru, M., and Siegel, W., 1982, Nucl. Phys. B236, 125.
11. Sohnius, M., and West, P., 1981, Phys. Lett. 100B, 45.
12. Flume, R., 1983, Nucl. Phys. B217, 531.
13. Freedman, D. Z., van Nieuwenhuizen, P., and Ferrara, S., 1976, Phys. Rev. D13,
3214.
14. Deser, S., and Zumino, B., 1976, Phys. Lett. 62B, 335.
15. For another approach based on supermetric tensors, see: Arnowitt, R., and Nath, P.,
1975, Phys. Lett. 56B, 177.
16. Salam, A., and Strathdee, J., 1974, Phys. Lett. 51B, 353.
17. Wess, J., and Zumino, B., 1974, Nucl. Phys. B70, 39.
18. Fayet, P., and Iliopoulos, J., Phys. Lett. 51B, 461.
19. O’Raifeartaigh, L., 1975, Nucl. Phys. B96, 331.
20. See: Gates, S. J., Grisaru, M. T., Rocek, M., and Siegel, W., 1983, Superspace: Or
One Thousand and One Lessons in Supersymmetry. Reading: Benjamin/Cummings.
21. Haag, R., Lopuszanski, J. T., and Sohnius, M. F., 1975, Nucl. Phys. B88, 257.
22. Kaku, M., Townsend, P., and van Nieuwenhuizen, P., 1978, Phys. Rev. D17, 3179.
23. Cremmer, E., Julia, B., and Scherk, J., 1978, Phys. Lett. 76B, 409.


Chapter 21. Superstrings 


1. Polyakov, A. M., 1981, Phys. Lett. 103B, 207, 211.
2. Nambu, Y., 1970, Lectures at the Copenhagen Summer Symposium.
3. Goto, T., 1971, Prog. Theor. Phys. 46, 1560.
4. See also the earlier work of: Susskind, L., 1970, Nuovo Cim. 69A, 457; Nielsen, H.
B., 1970, 15th Int. Conf. of High Energy Phys., Kiev.
5. Hsue, C. S., Sakita, B., and Virasoro, M. B., 1970, Phys. Rev. D2, 2857.
6. Virasoro, M. A., 1969, Phys. Rev. Lett. 22, 37.
7. Goddard, P., Goldstone, J., Rebbi, C., and Thorn, C. B., 1973, Nucl. Phys. B56, 109.
8. Kato, M., and Ogawa, K., 1983, Nucl. Phys. B212, 443.
9. Kikkawa, K., Sakita, B., and Virasoro, M. B., 1969, Phys. Rev. 184, 1701.
10. Bardakci, K., and Ruegg, H., 1969, Phys. Rev. 181, 1884.
11. Virasoro, M. A., 1969, Phys. Rev. Lett. 22, 37.
12. Goebel, C. J., and Sakita, B., Phys. Rev. Lett. 22, 257.
13. Chan, H. M., Phys. Lett. 28B, 425.
14. Koba, Z. J., and Nielsen, H. B., 1969, Nucl. Phys. B12, 517.
15. Veneziano, G., 1976, Nucl. Phys. B117, 519.
16. Suzuki, M. (unpublished).
17. Fubini, S., Gordon, D., and Veneziano, G., 1969, Phys. Lett. 29B, 679.
18. Virasoro, M. A., 1969, Phys. Rev. 177, 2309.
19. Shapiro, J., 1970, Phys. Lett. 33B, 361.
20. Ramond, P., 1971, Phys. Rev. D3, 2415.
21. Neveu, A., and Schwarz, J. H., 1971, Nucl. Phys. B31, 86.
22. Gervais, J. L., and Sakita, B., 1971, Nucl. Phys. B34, 632.
23. Green, M., and Schwarz, J. H., 1982, Nucl. Phys. B198, 252, 441.
24. Gross, D. J., Harvey, J. A., Martinec, E., and Rohm, R., 1985, Phys. Rev. Lett. 54,
502.
25. Shapiro, J., 1972, Phys. Rev. D5, 1945.
26. Kaku, M., and Yu, L. P., 1970, Phys. Lett. 33B, 166; 1971, Phys. Rev. D3, 2992,
3007, 3020.
27. Lovelace, C., 1970, Phys. Lett. 32B, 703; 1971, 34B, 500.
28. Alessandrini, V., 1971, Nuovo Cim. 2A, 321.


29. For more references, see: D’Hoker, E., and Phong, D. H., 1988, Rev. Mod. Phys. 60, 
917. 


30. Candelas, P., Horowitz, G., Strominger, A., and Witten, E., 1985, Nucl. Phys. B258,
46.


31. Dixon, L., Harvey, J., Vafa, C., and Witten, E., 1985, Nucl. Phys. B261, 678. 
32. Kawai, H., Lewellen, D. C., and Tye, S.H.H., 1986, Phys. Rev. Lett. 57, 1832. 
33. Antoniadis, I., Bachas, C., and Kounnas, C., 1987, Nucl. Phys. B289, 87. 

34. Lerche, W., Lüst, D., and Schellekens, A. N., 1987, Nucl. Phys. B287, 477.
35. Kaku, M., and Kikkawa, K., 1974, Phys. Rev. D10, 1110, 1823. 

36. Witten, E., 1986, Nucl. Phys. B268, 253.

37. Kaku, M., 1990, Phys. Rev. D41, 3733. 

38. Kugo, T., Kunitomo, H., and Suehiro, K., 1989, Phys. Lett. B226, 48. 

39. Saadi, M., and Zwiebach, B., 1989, Ann. Phys. 192, 213. 


References 


Field Theory 


1. Bjorken, J. D., and Drell, S. D. 1964. Relativistic Quantum Mechanics. New York:
McGraw-Hill.
2. Bjorken, J. D., and Drell, S. D. 1965. Relativistic Quantum Fields. New York:
McGraw-Hill.
3. Bogoliubov, N. N., and Shirkov, D. V. 1959. Introduction to the Theory of Quantized
Fields. New York: Wiley.
4. Chang, S. J. 1990. Introduction to Quantum Field Theory. Singapore: World Scientific.
5. Collins, J. 1984. Renormalization. Cambridge: Cambridge University Press.
6. Itzykson, C., and Zuber, J-B. 1980. Quantum Field Theory. New York: McGraw-Hill.
7. Jauch, J. M., and Rohrlich, F. 1955. The Theory of Photons and Electrons. Reading:
Addison-Wesley.
8. Mandl, F., and Shaw, G. 1984. Quantum Field Theory. New York: Wiley.
9. Ramond, P. 1989. Field Theory: A Modern Primer. Reading: Addison-Wesley.
10. Ryder, L. H. 1985. Quantum Field Theory. Cambridge: Cambridge University Press.
11. Sakurai, J. J. 1967. Advanced Quantum Mechanics. Reading: Addison-Wesley.
12. Schweber, S. S. 1961. An Introduction to Relativistic Quantum Field Theory. New
York: Harper & Row.
13. Schwinger, J. 1958. Quantum Electrodynamics. New York: Dover.
14. Zinn-Justin, J. 1989. Quantum Field Theory and Critical Phenomena. Oxford: Oxford
University Press.


Gauge Theories 


1. Cheng, T.-P., and Li, L.-F. 1984. Gauge Theory of Elementary Particle Physics.
Oxford: Oxford University Press.


2. Faddeev, L. D., and Slavnov, A. A. 1980. Gauge Fields: Introduction to Quantum 
Theory. Reading: Benjamin/Cummings. 


3. Frampton, P. H. 1987. Gauge Field Theories. Reading: Benjamin/Cummings. 


4. Muta, T. 1987. Foundations of Quantum Chromodynamics. Singapore: World Scientific.


5. Pokorski, S. 1987. Gauge Field Theories. Cambridge: Cambridge University Press. 


Particle Physics 


1. Becher, P., Bohm, M., and Joos, H. 1984. Gauge Theories of Strong and Electroweak 
Interaction. New York: Wiley. 


2. Gasiorowicz, S. 1966. Elementary Particle Physics. New York: Wiley. 
3. Huang, K. 1982. Quarks, Leptons, and Gauge Fields. Singapore: World Scientific. 


4. Lee, T. D. 1981. Particle Physics and Introduction to Field Theory. New York: 
Harwood Academic. 


5. Renton, P. 1990. Electroweak Interactions. Cambridge: Cambridge University Press.
6. Ross, G. G. 1985. Grand Unified Theories. Reading: Benjamin/Cummings. 


Critical and Non-Perturbative Phenomena 


1. Amit, D. J. 1978. The Renormalization Group and Critical Phenomena. New York: 
McGraw-Hill. 


2. Creutz, M. 1983. Quarks, Gluons and Lattices. Cambridge: Cambridge University 
Press. 


3. Ma, S.-K. 1976. Modern Theory of Critical Phenomena. Reading: Benjamin/Cummings.
4. Rajaraman, R. 1989. Solitons and Instantons. Amsterdam: North-Holland. 


5. Rebbi, C. 1983. Lattice Gauge Theories and Monte Carlo Simulations, Singapore: 
World Scientific. 


6. Sakita, B. 1985. Quantum Theory of Many-Variable Systems and Fields. Singapore: 
World Scientific. 


Supergravity 


1. Gates, S. J., Grisaru, M. T., Rocek, M., and Siegel, W. 1983. Superspace. Reading: 
Benjamin/Cummings. 


2. Jacob, M., ed. 1986. Supersymmetry and Supergravity. Amsterdam: North-Holland 
and World Scientific. 




3. Mohapatra, R. N. 1986. Unification and Supersymmetry: The Frontiers of Quark— 
Lepton Physics. New York: Springer-Verlag. 


4. West, P. 1990. Introduction to Supersymmetry and Supergravity. Singapore: World 
Scientific. 


Superstrings 


1. Frampton, P. H. 1974. Dual Resonance Models. Reading: Benjamin/Cummings. 


2. Green, M. B., Schwarz, J. H., and Witten, E. 1987. Superstring Theory. Vols. I and
II. Cambridge: Cambridge University Press.


3. Jacob, M., ed. 1974. Dual Theory. Amsterdam: North-Holland. 
4. Kaku, M. 1988. Introduction to Superstrings. New York: Springer-Verlag. 


5. Kaku, M. 1991. Strings, Conformal Fields, and Topology. New York: Springer- 
Verlag. 


6. Schwarz, J. H., ed. 1985. Superstrings. Vols. I and II. Singapore: World Scientific.


Index 


Abelian group, 42 

Action principle, 16-21 

Adjoint representation, 47 

Adler sum rule, 470 
Adler—Bell—Jackiw anomaly, 414-20 
Adler—Weisberger sum rule, 393-96 
Advanced Green’s function, 74 
Analyticity, 164, 199-204 

Angular momentum, 25, 28-29 
Annihilation operator, 67 
Anomalous dimension, 480 
Anomalous magnetic moment, 104, 189-94 
Anomaly, 380, 414-20, 423-29, 612 
Anti-commutator, 48, 87 

Antimatter, 71, 75, 87, 90-91 
Anti-unitary, 120 

Area law, 514 

Asymptotic freedom, 12, 378, 460, 483-88, 609 
Asymptotic series, 452-53 
Asymptotic states, 141-51 
Atiyah—Hirzebruch theorem, 656 
Auxiliary field, 667-68, 697 

Axial gauge, 107 

Axions, 564-65 


Background field method, 500, 658 
Backlund transformation, 535 
Baker—Campbell—Hausdorff theorem, 46 
Bare parameters, 210, 219 

Baryon, 7-8 

Baryon number conservation, 15, 379, 561, 621 
Beta decay, 5, 8, 333 

Beta function, 478-91, 610, 685 
Bethe—Heitler cross section, 180 
Bhabha scattering, 176-77 

Bianchi identity, 297-98 

Big Bang cosmology, 15, 643-49 
Bjorken scaling, 459-65 
Bloch—Nordsieck, 180 

Block spin method, 590-97 
Bogoliubov’s R operation, 442, 448 
Bohr magneton, 104 

Boltzmann partition function, 571 

Borel transformation, 453-56


Bottom quark, 371, 384 

BPHZ renormalization, 211, 441-51 
Braid group, 584 

Bremsstrahlung, 177-84 

BRST quantization, 62, 412-14, 711 
BV quantization, 62 


Cabibbo angle, 397 
Calabi-Yau manifold, 726, 729 
Callan—Gross sum rule, 465 
Callan—Symanzik relation, 485-88, 597-99, 610 
Canonical quantization
    fermion fields, 86-88
    scalar fields, 64-65
    vector fields, 110-11
Cartan subalgebra, 748 
Casimir operator, 55, 59 
Causality, 34, 74, 76, 90, 190 
Chan—Paton factor, 722 
Charge conjugation, 117-19 
Charged scalar field, 69-72 
Charge renormalization, 188, 232, 243 
Charm, 371-72, 400 
Chiral anomaly, 380, 414-20, 562, 612 
Chiral superfield, 674 
Chiral symmetry, 379-80 
Christoffel symbol, 636 
Clifford algebra, 48, 78, 625 
Coherent states, 715 
Coleman—Mandula theorem, 57, 369, 664, 695 
Coleman—Weinberg mechanism, 348-57 
Color, 12, 374-75 
Compact groups, 34 
Compactification, 650-56, 694 
Compton scattering, 165-70 
Confinement, 378-79, 512-25 
Conformal field theory, 580, 584, 727 
Conformal transformation, 500, 574, 580, 688, 706

Connected graphs, 279-83 
Connection field, 296, 636-37 
Conserved Vector Current (CVC), 390-91 
Cooper pairs, 360 
Correlation length, 573 




Cosets, 545 
Cosmology
    Big Bang theory, 643-49
    constant, 639, 644, 649-50
    CP violation, 646
    helium abundance, 643
    Hubble’s law, 643
    inflation, 648-49
    microwave radiation, 643
    nucleosynthesis, 643
    red shift, 643
Coulomb gauge, 107, 307-11, 314-18 
Coulomb scattering, 134-40 
Coulomb term, 109 
Counterterms, 211, 226, 233, 433-35, 450, 658 
Covariant derivative
    gauge theory, 103, 296
    general relativity, 636
CP violation, 120, 402, 559-65, 646
CP_N, 568-69
CP_2, 655-56
CPT theorem, 120-23
Creation operators, 67
Critical exponents
    block spin, 597
    epsilon expansion, 603
    Ginzburg–Landau, 587
    Ising, 578
    mean field theory, 586
    scaling, 590
Crossing symmetry, 164, 199
Cross section, 127-34
Curie law, 585 
Current, 24 
Current algebra, 384-96 
Current conservation, 24 


Decay rate, 133-34 

Decuplets, 365, 370 

Deep inelastic scattering, 459-70 
Desert hypothesis, 15, 611 

De Sitter group, 499, 689 

De Sitter universe, 648 

Detailed balance, 520 

Dilute gas approximation, 549-50 
Dimensional reduction, 650-56 
Dimensional regularization, 235, 436-41
Dimensional transmutation, 348, 353, 358
Dimension of fields, 213 

Dirac equation, 77 

Dirac matrices, 77-92 

Dirac monopole, 539-43 

Dirac representation, 82, 753 

Dirac sea, 90 




Dirac spinor, 77-92 

Dirac string, 540-43 

Disconnected graphs, 143-45 

Dispersion relations, 199-204 
Dyson—Ward renormalization, 250-55, 450 


E6, 626, 630, 741
E7, 741
E8 ⊗ E8 string, 700, 723
Effective potential, 348-57 
Eightfold way, 364 
Electric field, 99 
Electromagnetic tensor, 100 
Electron—electron scattering, 173-76 
Electron–positron scattering, 176-77
Electroweak interactions, 11, 335-57 
Energy—momentum tensor, 27, 101, 708, 718 
Epsilon expansion, 597-605 
Equivalence principle, 30, 633 
Euclidean metric
    Callan–Symanzik, 488
    instantons, 455, 530, 548, 554-56
    lattice, 505-6
    path integrals, 263-64
Euler—Lagrange equations, 18, 23 
Euler number, 607 
Exceptional groups, 741 
Exotics, 8, 377 


F4, 741 
Faddeev—Popov ghosts, 301-4 
Faddeev—Popov quantization, 298-304, 711 
Fayet—Iliopoulos mechanism, 677 
Fermi action, 8, 333, 386 
Fermi constant, 333 
Fermion doubling problem, 511 
Feynman gauge, 107 
Feynman parameter, 182 
Feynman path integral, 261-91 
Feynman propagator, 74, 91-92, 138-39 
Feynman rules
    gauge theory, 304-6
    gravity, 657-60
    QED, 157-58
    scalar theory, 158-59
    supersymmetric theories, 680-83
Fiber bundles, 541-42 
Fierz identity, 95-96 
Fixed point, 481 
Flatness problem, 648-49 
Flavor, 12, 373-74 
Flux, 129, 132-33 
Fock space, 68-69, 707, 713, 719 
Foldy—Wouthuysen transformation, 97 




Forests, 447-51 

Form factor, 190, 392, 404, 462, 468 
Forms, 125 

Free energy, 572 

Friedman universe, 645 

Fujikawa’s method, 424-29
Functional integration, 264-73 
Fundamental representation, 364, 742 
Furry’s theorem, 160, 228 


G2, 741 
Gauge fixing 

axial, 107 

Coulomb, 107, 307-11, 314-18 

Feynman, 107, 343 

Landau, 107, 343 

Rξ, 342-44 

renormalizable, 107, 344 

’t Hooft, 345-46 

unitary, 107, 344 
Gauge theory, 295-318 
Gauss—Bonnet theorem, 659 
Gaussian integration, 266-67, 275-76, 587 
Gauss’s law, 109, 308, 522 
Gell-Mann-Nishijima relation, 365 
Gell-Mann—Okubo relation, 369 
General covariance, 634-39 
General relativity, 633, 639-60 
Generating functional, 275-78 
Generation problem, 9, 13, 374, 384, 402, 729 
Ghosts, 9, 62, 112-14, 301-4, 412-14, 711 
Ginzburg-Landau model, 360, 586-87, 592-94 
GL(N), 54, 640 
Glashow-Iliopoulos-Maiani (GIM) mechanism, 399-400 

Glashow—Weinberg—Salam model, 384, 423 
Global symmetry, 34, 101, 296 
Goldberger-Treiman relation, 393 
Goldstone boson, See Nambu—Goldstone bosons 
Gordon identity, 96 
Grand Unified Theory (GUT) 

SU(5), 611-22 

SO(10), 622-26 
Grassmann 

integration, 286-87 

number, 285-89 
Graviton, 657 
Gravity theory, 633-60 
Green-Schwarz string, 720-21 
Green’s function, 73, 91-92, 267, 714 
Gribov ambiguity, 311-14 
Gross-Llewellyn Smith sum rule, 470 
Gross—Neveu model, 357-58 
Group measure, 299-300, 515-17, 525-26, 557 




Group theory, 33-57, 741-50 
Gupta-Bleuler quantization, 62, 112-14, 705 
Gyromagnetic ratio, 104, 189, 194 


Hadron, 6, 8 

Hamilton’s equations, 19 

Han—Nambu model, 376 

Harmonic oscillators, 65-69, 706-8 

Heisenberg model, 580 

Heisenberg picture, 147 

Heisenberg representation, 147 

Heisenberg Uncertainty, 17 

Helicity, 56, 691 

Heterotic string, 721-22 

Hierarchy problem, 622, 628, 683 
GUTs, 15, 622 
supersymmetry, 664 
technicolor, 627 

Higgs mechanism, 326-32, 619-22 

Hodge operator, 125 

Hole theory, 4, 90 

Homotopy, 537, 544-45, 558-59 

Horizon problem, 648-49 

Hubble’s constant, 643 

Huygens’ principle, 135 

Hypercharge, 364-68 


Inflation, 648-49 
Infrared divergence, 177, 180-84, 194-96, 755-6059 
Infrared slavery, 378, 485 
Infrared stable, 481 
“In” states, 141-44 
Instantons 
Borel transform, 455 
dilute gas approximation, 549-50 
quantum mechanics, 545-54 
strong CP problem, 559-65 
θ vacua, 402, 559-65 
U(1) problem, 559-63 
WKB, 545-53 
Intercept, 707-8 
Internal symmetry, 34 
Invariant group measure, 299-300, 515-17, 525-26, 557 
IRF models, 580-81 
Ising model, 575-80 
Isometry, 653 


Jacobi identity, 297 
Jets, 380-84 


Källén-Lehmann spectral representation, 161 
Kaluza—Klein model, 650-56, 727 




Killing vector, 653 

Kirchhoff’s laws, 207 

Klein—Gordon equation, 23, 63-69, 703 
Klein-Nishina formula, 169 

K meson, 120, 366, 393, 396 

Knot theory, 584 

Kobayashi—Maskawa matrix, 401-2 


Ladder operators, 747-48 

Lagrangian, 17-20 

Lamb shift, 106, 196-99 

Landau gauge, 107, 314-18 

Landau singularity, 483 

Landau’s mean field theory, 584-88 

Large N, 607 

Lattice gauge theory, 13, 505-25 

Lepton, 8 

Lie algebra, 43, 654 

Lie group, 7, 33-57, 741-50 

Light-cone expansion, 470-76 

Light-cone string field theory, 730-32 

Light-like separation, 50 

Local symmetry, 34, 101, 296 

London’s equation, 360 

Loop expansion, 163, 284-85 

Lorentz group, 28, 34, 50-51, 67, 78-81, 641, 
710, 749-51 

LSZ reduction formalism, 141-56 


Magnetic moment, 104, 189, 194 
Magnetization, 572 
Majorana representation, 93, 754 
Majorana mass, 630-31 
Mandelstam variables, 202, 223 
Mass independent regularization, 479, 488-91 
Massive vector field, 114, 215 
Mass shell, 63, 135 
Master groups, 35, 57 
Maxwell’s equations, 99-102, 125 
Mean field approximation, 584-88 
Meissner effect, 361 
Metropolis algorithm, 518 
Michel parameters, 341 
Minimal SU(5) model, 15, 611-22 
Minimal subtraction, 479, 488-91 
Mixing angles, 396-402 
Modular symmetry, 725, 738 
Møller scattering, 173-77 
Monopoles 

Dirac, 539-43 

’t Hooft-Polyakov, 543-45 
Monte Carlo simulation, 518-21 
Mott cross section, 140 
Muon, 8, 11, 335, 384 




Muon neutrino, 9, 11, 335, 384 
Minimal SU(5) model, 15, 611-22 


1/N expansion, 607-8 

Nambu-Goldstone bosons, 326-42, 391 

Nambu-Goto action, 705 

Nambu-Jona-Lasinio model, 358 

Negative energy, 63, 75, 85, 87, 90 

Nested graphs, 448 

Neutrino, 93-94, 384 

Neutrino sum rules, 467-70 

Neveu—Schwarz model, 717-21 

Newton’s constant, 6, 9-10, 633, 657, 640 

Noether’s theorem, 23-30 

No-go theorem, 35, 57, 79, 96, 123, 369, 664, 
695 

Non-Abelian groups, 42 

Non-compact groups, 34 

Non-leptonic decays, 398-99 

Non-renormalization theorems, 15, 35, 682-88 

Normal ordering, 68, 152, 682-88 

Neveu—Schwarz—Ramond strings, 663, 666, 717 

Nucleosynthesis, 643 


Octets, 365-70 

Off-shell, 127, 135, 667-68 

O(4, 2), 500 

O(N), 46-48 

One-particle irreducible, 219, 280, 354 

On-shell, 127, 135, 667-68 

Operator product expansion, 470-76 

Optical theorem, 201 

O’Raifeartaigh mechanism, 678 

Orbifolds, 727 

Orthochronous Lorentz group, 751 

Orthogonal group, 37 

Osp(1/4), 688, 692-93 

Osp(N/M), 688 

“Out” states, 141-45 

Overlapping divergence, 210, 247-49, 255-56, 
443-47 


Pair annihilation, 170-73 

Parity transformation, 38, 117 

Parity violation, 9, 120 

Partially Conserved Axial Current (PCAC), 
389-96 

Parton model, 463-70 

Path integrals, 261-91 

Pauli—Lubanski vector, 55 

Pauli spin matrices, 44 

Pauli—Villars cut-off, 185, 235, 420, 656 

Peccei—Quinn symmetry, 564-65 

Perimeter law, 514 




Phase transition, 571-605 

Pion decay, 420-24 

Pion-nucleon scattering, 202-4, 392 
Planck length, 10, 15, 645, 650, 656, 700, 736 
Plaquette, 506-7 

Planck’s constant, 16-17, 284 

Poincaré group, 34, 51-57, 67, 89, 689 
Poisson brackets, 30, 31 

Polarization, 84, 110 

Power counting, 211, 218-19, 227-28, 433 
Poynting vector, 102 

Preon, 627 

Propagator theory, 72-76, 134-40 

Proper graphs, 219, 280, 354 

Proton decay, 15, 621 

Pseudoscalar, 80 

Pseudotensor, 41 


Quantization 
BRST, 62, 412-14, 711 
BV, 62 
equal-time, 64-65, 86-88, 110-11 
Gupta-Bleuler, 62, 112-14, 705 
light-cone, 470-75, 730-32 
second, 21, 273-78, 701, 730 
Quantum chromodynamics (QCD) 
anomaly cancellation, 380 
asymptotic freedom, 378, 483-85 
current algebra, 384-89 
jets, 376-77 
sum rules, 463-75 


Quantum electrodynamics (QED), 2, 99-123, 453-56 
Quarks, 34, 363-84 
confinement, 378-79, 512-15, 522-23 
currents, 384-96 
jets, 380-84 
model, 34, 363-84 
masses, 374 
partons, 463-70 


Radiative corrections 
effective potential, 348-56 
scattering, 184-99 

Rank, 54 

Red shift, 643 

Regge behavior, 737 

Regge intercept, 707-8 

Regge trajectory, 707-8 

Regularization 
dimensional, 235, 436-41 
lattice, 236 
MS, 479, 488-91 
Pauli—Villars, 185, 235, 656 




Renormalization group, 234, 476-98 
Renormalization theory 

BPHZ, 211, 441-51 

counterterms, 211, 226, 233, 433-35, 450 

Dyson/Ward, 250-55, 450 

gauge theory, 431-51 

group, 234, 476-98 

multiplicative, 211, 224, 232, 254, 435 

point, 223, 233-35 

QED, 250-56 

renormalization group proof, 494-98 
Representation 

adjoint, 47 

fundamental, 364, 742 

irreducible, 39 

reducible, 39 

spinor, 48, 78, 625, 640 

tensor, 40, 743 
Retarded Green’s function, 74 
Rξ gauge, 343-44 
Ricci tensor, 638 
Riemann curvature, 638, 656-59 
Riemann surface, 712-3 
Robertson—Walker universe, 643-45 
Rosenbluth formula, 205 
Running coupling constant, 480-83, 609-11 
Rutherford scattering, 139-40 


Sie 
S2, 536-8 
S3, 558 
Sakata model, 7, 364 
Scalar field, 23, 63-69 
Scaling 
Bjorken, 459-65 
block spins, 574, 590-97 
current commutators, 470-75 
parton model, 463-70 
violations, 491-94 
Schrödinger equation, 20, 104, 272-73 
Schrödinger picture, 147 
Schrödinger representation, 147 
Schwinger—Dyson equations, 257, 288-91 
Schwinger terms, 389 
Seagull graph, 159 
Second quantization, 21, 273-78, 701, 730 
Self-dual tensor, 554 
Self-energy graph, 220-22, 228-32, 240-41, 
243-49, 439-41 
Sigma model, 536-37 
Sine—Gordon equation, 533-36 
Skeleton graph, 247-49, 447-51 
SL(2, C), 59, 670 
Slavnov-Taylor identities, 411-12, 435 




Slepton, 679 
S matrix, 57, 130, 141, 199 
S matrix theory, 199-200 
SO(2), 39-42 
SO(3), 42-45 
SO(3, 1), 50 
SO(4), 48 
SO(6), 48 
SO(8) supergravity, 665, 692-96 
SO(10) GUT, 15, 622-26, 629 
SO(N), 46, 625-26 
Solitons, 529-39 
Space-like separation, 50, 76 
Specific heat, 573 
Spectral representation, 616 
Spinor representation, 48, 78, 625, 640 
Spin-statistics, 74, 76, 88, 90, 375 
Sp(N), 688, 741 
Spontaneous symmetry breaking 
Higgs mechanism, 11, 326-32, 619-22, 649 
Nambu-Goldstone bosons, 326-32, 391 
superconductivity, 360-61 
Weinberg—Salam model, 335-37 
Squark, 679 
Standard model, 13-14, 363-403, 619, 655, 665, 
695, 727 
Strangeness, 364-66 
Strange quark, 364 
Strings 
bosonic, 704-17 
Dirac, 540-43 
heterotic, 721-22 
super, 717-21 
tension, 512, 520-23 
types, 721-22 
Strong coupling, 514-17 
Strong CP problem, 559-65 
Strong interactions, 6-8 
Structure constants, 43, 47 
SU(2), 42-45 
SU(2, 2), 500 
SU(2) ⊗ SU(2), 48, 391, 561 
SU(3) 
algebra, 747-48 
color, 12, 374-75 
flavor, 363-71 
SU(3) ⊗ SU(3), 379 
SU(4), 48, 59, 371-73 
SU(5) GUT, 15, 611-22 
SU(6), 369 
SU(6, 6), 57 
SU(N), 741-46 
SU(N/M), 689, 669-80 
Subquarks, 627 




Substitution rule, 164, 170, 176 
Subtraction point, 223, 233-35 
Superconductivity, 360-61 
Superconformal group, 719 
Superfield, 669-80 
Supergravity, 692-96 
Super groups, 688-92 
Superpotential, 678-79 
Super-renormalizable theories, 216-17 
Super space, 669-80 
Supersymmetry, 15, 34-35, 663-95, 717-29 
Super-Yang-Mills theory, 217 
N = 1, 217, 675, 668 
N = 4, 16, 217, 685 
nonrenormalization theorem, 686-88 
Susceptibility, 573 
Symmetry breaking, 321-57 
Symmetry restoration, 359-60 


Tachyon, 56, 323, 707, 714, 738 
Tangent space, 640-41 

τ neutrino, 11, 335, 384 
Technicolor, 627 

Temporal gauge, 107 

Tensor representation, 40, 743 
Theta vacua, 402, 559-65 
Thomson cross section, 170, 201 
’t Hooft gauge, 345-48 

’t Hooft-Polyakov monopole, 543-45 
Thrust, 383 

Time ordering, 76, 149 
Time-reversal invariance, 119-20 
Topological charge, 533-34, 537-38 
Top quark, 11, 384 

Transfer matrix, 575-84 
Transverse fields, 109-11, 308-11 
Tree diagrams, 163, 284-85 
Triangle anomaly, 414-20 
Tunneling, 545-53 

Twist, 491 

Type I strings, 721-22 

Type IIA,B strings, 721-22 


U(1), 39-42, 101 

U(1) problem, 559-63 
Ultraviolet fixed point, 481 
Unitarity, 34, 69, 130, 199-202 
Unitary gauge, 107, 344 
Universality, 574, 588 


V − A theory, 333 

Vacuum, 67 

Vacuum polarization, 185-89, 228 
Veneziano formula, 714 




Vertex models, 581 
Vierbein, 640-42, 644 
Virasoro algebra, 708-9 


Ward-Takahashi identity, 215, 232, 243-47, 256, 
407-10, 665 

W boson, 9, 11-12, 337-38 

Wave function renormalization, 142, 224, 233 

Weak interactions, 333-35 

Weinberg angle, 337, 618, 625 

Weinberg—Salam model, 11, 335-57 

Weinberg’s theorem, 212, 488 

Wess—Zumino gauge, 676 

Wess—Zumino model, 667, 675 

Weyl fermions, 93-94 

Weyl representation, 49, 669, 753 

Weyl tensor, 659 

Wick’s theorem, 151-54 

Wilson lattice, 505-25 




Wilson loop, 513 

Winding number, 537, 544-45, 558-59 
Witten action, 732-35 

WKB approximation, 545-53 

World sheet, 703-4 


XYZ model, 580 


Yang—Baxter relation, 581-84 

Yang-Mills field, 295-313, 554, 652-56, 707 
Young tableaux, 612, 744-46 

Yukawa theory, 6, 9, 215 


Zp, 526 

Zn, 521 

Z boson, 12, 337-38 

Zero point energy, 67-68, 87, 696 
Zimmermann’s solution, 444-49 




“A massive work covering a grand variety of traditional subjects... [It] has done 
all of us the favor of including, in addition, a good number of current research 
topics that are normally dealt with only in conference proceedings or special- 
ized texts of their own. This is a lot of book.” 

International Journal of Quantum Chemistry 


This lucid introduction to modern quantum field theory fills the need for a 
text that details the basics of field theory as well as the practical and theoreti- 
cal implications of quantum chromodynamics (QCD) and the Standard 
Model. The first part of the book lays a solid foundation by presenting canon- 
ical quantization, Feynman rules and scattering matrices, and renormaliza- 
tion. The book proceeds to a comprehensive presentation of the Standard 
Model, path integrals, gauge theory, spontaneous symmetry breaking, the 
renormalization group, and BPHZ quantization. Finally, the book concludes 
with a discussion of more advanced topics, such as critical phenomena, lat- 
tice gauge theory, instantons, grand unified theories, supersymmetry, quan- 
tum gravity, supergravity, and superstrings. Over 260 helpful exercises are 
included. 
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